[xquery-talk] Does XQuery fit anywhere in this landscape.

Ihe Onwuka ihe.onwuka at gmail.com
Sat Jun 27 06:51:25 PDT 2015


Below is a blog post from Norm Matloff, a statistician who lives in a CS
department at UC Davis and the author of The Art of R Programming, the book
I referred to in another post:

http://blog.revolutionanalytics.com/2014/08/statistics-losing-ground-to-cs-losing-image-among-students.html

The article is interesting for a number of reasons, not least the parallel
offered by its core theme, the image problem of a discipline that is
perceived to be unfashionable. The reference to a CS usurpation problem is
ironic because the reverse argument could also be made - really bad CS (e.g.
data management) being entrenched as standard by statisticians (and the
like) in the name of data science - which just goes to show that there are
probably two valid sides to that coin.

Enough preamble. I am wondering whether a perusal of some of the comments
reveals an opportunity.

Tom (26/8/14 @ 23:25) surmises that CS students are turned off Stats
classes because of the use of R: "*as a programming language it is
horrible and needs to die in a fire*". I was hoping to see a reasoned
rebuttal of a viewpoint I share, but Matloff really didn't deal with it well.

There is another comment by Jaipelai 27/8/14 @ 09:42 that I could almost
have written myself.

The point being that Stonebraker identifies that people will want to do
analytics with their query languages, but all the analytics tools suck at
data management. That market is all the rage now, but the R and Python
communities are probably lost causes.

So rather than bring analytics to a query language, suppose instead one
looked at baking a best-of-breed query capability into an analytics
language that is fashionable, functional and comprehension-friendly -
Julia.

On Tue, Jun 23, 2015 at 12:52 PM, daniela florescu <dflorescu at me.com> wrote:

>
> On Jun 23, 2015, at 9:14 AM, Ihe Onwuka <ihe.onwuka at gmail.com> wrote:
>
> Well, he didn't comment on SQL for JSON per se, but saying that RDBMS are
> sub-optimal for everything is a tacit repudiation of SQL, is it not?
>
>
> No, because he said explicitly that the *internals* of a database will
> be different (columnar, main memory, streaming, etc)….. the
> programming language will STILL be SQL.  Or at least for all those
> databases for which the data model is STILL relational.
>
>
> He buys into the notion that there will be swarms of data scientists doing
> clever things with data which will need a different language.
>
>
> Yes. SQL clearly doesn’t solve the R use cases. So yes, R is on the
> “acceptable OTHER languages” list.
>
> But it’s not clear that what we (aka the XML community) see as “normal”
> data processing use cases will be considered necessary use cases
> for the JSON/NoSQL community.
>
> E.g. scanning the data and automatically extracting a schema. Is this an
> acceptable use case for JSON? Or not?
>
> If yes, then XQuery has a chance, because XQuery can do that and SQL
> cannot.
>
> If no, people will stick to what they know : SQL.
>
>
> He is right that statistical packages suck at data management but that
> isn't going to deter the R community.
>
>
> Yes, the R implementations (I looked at them in detail about 2 years ago)
> have NO IDEA about how to deal with large volumes
> of data, so probably a mix between data technologies and database
> technologies is necessary.
>
> However, don’t underestimate companies like Oracle. They are not dummies,
> and they know what the market wants.
> R is supported natively inside the Oracle database now.
>
> I think that Stonebraker exaggerates when he says that relational
> databases will disappear in 10 years. Well… I don’t think
> this will happen so quickly.
>
>
> Do you see XQuery fitting anywhere in this vision? It has potential as a
> pipelining technology, as does SQL for that matter. I think it will always
> be problematic to do analytics on the source data because it is too dirty.
>
>
> XQuery COULD be a very good “glue” language between data in various
> formats (CSV, Excel, PDF, HTML, XML, JSON, relational, whatever).
>
> But I say “COULD” not “CAN”.
>
> It needs many extensions to be good at that: scripting, support for JSON,
> modules to support a variety of data formats and data processing services.
>
>
> Best regards
> Dana
>
>
>
> P.S.
>
> I am continually surprised that people this smart believe that there is
> such a pool of data scientists to draw from.
>
>
> Me too. I fell off my chair when I saw the article saying that the US
> needs 5 million data scientists in the next 2 years, aka about 5% of the
> US working population. Not sure if this is for laughing or for crying.
>
> [[ aka, we will not have cashiers at Safeway anymore ‘cause they are all
> data scientists….]]
>
> Someone up there doing the math in this article doesn’t understand jack
> nothing about numbers and statistics …….
>
> And all this while:
>
> http://www.nature.com/news/irreproducible-biology-research-costs-put-at-28-billion-per-year-1.17711?utm_content=buffer95bfb&utm_medium=social&utm_source=linkedin.com&utm_campaign=buffer
>
> God knows how many medicines are wrongly given to sick people, because
> nobody knows how to do a proper case study …
> REALLY scary … but that’s another discussion.
>
> Again the same discussion comes up: DON’T look for 5 million data
> scientists. Just make do with a smaller number of smart ones, but GIVE
> THEM BETTER TOOLS and AUTOMATE THE PROCESS.
>
> But hey, how can you stop such worldwide enthusiasm for “data
> scientists” !?? Logic doesn’t do it….
>
>
>
>
> On Tue, Jun 23, 2015 at 11:51 AM, daniela florescu <dflorescu at me.com>
> wrote:
>
>> Ihe,
>>
>>
>> I had discussions with Michael Stonebraker for 20 years about whether
>> XML “exists” or not. With Jim Gray too, before he disappeared. They were
>> both extremely supportive of me, yet both thought that I was crazy to
>> waste my research career on XML.
>>
>> Stonebraker’s opinion: he doesn’t believe that XML “exists” in industry.
>>
>> So he will not mention it, because it doesn’t exist :-)
>>
>> But you have to remember that Stonebraker is a database person. Probably
>> he will not understand the facet of XML which is “XML as documents”. It
>> took me and the other database people involved in XQuery years before we
>> swallowed it. (Don Chamberlin of SQL fame famously once said “who in the
>> world would care about such a corner case as mixed content !?”).
>>
>> Don’t blame the database people that they don’t “get” XML. On one hand,
>> it has never been explained
>> to them properly.
>>
>> And again, Stonebraker, being a database person, will look at the “XML
>> as data” aspect of the story. And this today is INDEED non-existent in
>> industry, or almost. Or, when it exists, it is mostly for log analysis.
>>
>> ============
>>
>> JSON will completely change the landscape, in surprising ways, that none
>> of us can predict.
>>
>> And no, I trust that Michael Stonebraker is too smart to believe that
>> SQL is a solution to process JSON.
>>
>> But time will tell.
>>
>> Best regards
>> Dana
>>
>>
>>
>>
>>
>> On Jun 23, 2015, at 12:15 AM, Ihe Onwuka <ihe.onwuka at gmail.com> wrote:
>>
>> https://www.youtube.com/watch?v=9K0SWs1mOD0
>>
>> By implication it puts the kibosh on SQL as the basis of a solution for
>> the future.