[xquery-talk] Doing some Pattern Frequency Distribution

Fri Jun 9 09:21:54 PDT 2006

> > Incidentally, XML databases vary greatly in their ability to handle
> > large documents. Many are optimized to handle large numbers of small
> > documents, and behave rather poorly when asked to deal with small
> > numbers of gigabyte-sized documents.
> 
> This might actually qualify as a distinction between native 
> and not so native XML databases. In a real, native XML 
> database it should not make much of a difference

What kind of documents an XML database is optimized for is certainly an
interesting property, but let's not overload the adjective "native"
with yet another meaning. The term has become meaningless enough already
through overloading.

Let's also avoid treating the question as if there were one right answer
("should"). That's like saying that relational databases "should" be able to
handle tables with a billion columns. It's a vendor decision whether they
consider that use case important.

If there were an agreed approach to data modelling for XML databases, for
example if everyone agreed that you should put all your data in one
document, then you could make such a case. But there isn't. It's quite
plausible to argue that you "should" allocate one document for each business
object.

Michael Kay
http://www.saxonica.com/