[xquery-talk] functional languages for data processing

daniela florescu dflorescu at me.com
Wed Jun 17 13:07:22 PDT 2015


Ihe,

your question started a long thread that went beyond what you originally asked: is the lack of 
(greater) adoption of XQuery caused by the fact that it is a functional language?

I already gave you my opinion on why this is NOT the case. (I still believe that XQuery hasn’t been more successful so far
simply because there is no money in the XML market. It’s as simple as that.)

In fact, I would go much further, and assert here that a data processing language (the processing being complex and 
the data being complex)  pretty much HAS to be functional in nature.

Here are my four reasons why I think so:

1. Functional languages have high expressive power (usually Turing complete)
-----------------------------------------------------------------------------------------------------------

Of course, it depends on what the use cases are, but if you talk about complex data processing, 
at some point you need serious expressive power.

For complex processing you need recursive functions and higher-order functions. 

Note: that’s where SQL falls short. Its expressive power is too limited for complex processing (e.g. serious ETL
on semi-structured data).
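
As a rough illustration of why both ingredients matter on nested data, here is a minimal sketch in Python (chosen only for the sketch; the names fold_json, count_leaves, max_depth and the sample document are hypothetical). Recursion handles arbitrary nesting, and the higher-order parameters let one traversal serve several computations.

```python
def fold_json(value, leaf, combine):
    """Recursively fold a JSON-like value.

    leaf:    function applied to every scalar
    combine: function applied to the list of results of the children
    """
    if isinstance(value, dict):
        return combine([fold_json(v, leaf, combine) for v in value.values()])
    if isinstance(value, list):
        return combine([fold_json(v, leaf, combine) for v in value])
    return leaf(value)


def count_leaves(doc):
    # Every scalar counts as 1; containers add up their children.
    return fold_json(doc, lambda _: 1, sum)


def max_depth(doc):
    # Scalars have depth 0; a container is one level deeper than its deepest child.
    return fold_json(doc, lambda _: 0, lambda depths: 1 + max(depths, default=0))


order = {"id": 7, "items": [{"sku": "a", "qty": 2}, {"sku": "b", "tags": ["x", "y"]}]}
print(count_leaves(order), max_depth(order))
```

The same traversal is reused unchanged for both questions; only the two small functions passed in differ.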

2. Functional languages are extremely productive.
-------------------------------------------------------------------

Any good developer who has written serious programs in a well-designed functional language will hardly 
go back to the lack of productivity of an imperative language.

I won’t.

E.g. try to write a simple program that scans a JSON dataset and creates a JSON schema out of it, in both JSONiq
and JavaScript. I did it in JSONiq in 2 hours. Not in a million years can the best JavaScript developer do that in 2 hours.
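
To make the task concrete, here is a small sketch of what "creating a schema out of a JSON dataset" can look like when written in a functional style. It is not the JSONiq program mentioned above. It is written in Python purely for illustration, and the schema format (plain dicts with "type", "fields" and "items") is an assumption of the sketch, not the JSON Schema standard. Fields missing from some documents are simply merged in, so optionality is not tracked.

```python
from functools import reduce

UNKNOWN = {"type": "unknown"}  # neutral element for merge (hypothetical convention)


def infer(value):
    """Map one JSON-like value to a small schema description."""
    if isinstance(value, dict):
        return {"type": "object", "fields": {k: infer(v) for k, v in value.items()}}
    if isinstance(value, list):
        return {"type": "array", "items": reduce(merge, map(infer, value), UNKNOWN)}
    if isinstance(value, bool):  # test bool first: bool is a subclass of int
        return {"type": "boolean"}
    if isinstance(value, (int, float)):
        return {"type": "number"}
    if isinstance(value, str):
        return {"type": "string"}
    return {"type": "null"}


def merge(a, b):
    """Combine two schemas into one that describes both."""
    if a["type"] == "unknown":
        return b
    if b["type"] == "unknown":
        return a
    if a["type"] == b["type"] == "object":
        keys = set(a["fields"]) | set(b["fields"])
        return {"type": "object",
                "fields": {k: merge(a["fields"].get(k, UNKNOWN),
                                    b["fields"].get(k, UNKNOWN)) for k in keys}}
    if a["type"] == b["type"] == "array":
        return {"type": "array", "items": merge(a["items"], b["items"])}
    if a["type"] == b["type"]:
        return a
    return {"type": "any"}  # conflicting types collapse to "any"


def infer_dataset(docs):
    # The whole job is a map (infer) followed by a fold (merge).
    return reduce(merge, map(infer, docs), UNKNOWN)


docs = [{"id": 1, "tags": ["a"]}, {"id": "x", "name": "n"}]
print(infer_dataset(docs))
```

The structure is one recursive function per concern and a single fold over the dataset, with no mutable state to keep consistent.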

The increase in productivity from an imperative programming language to a functional language is 10X or more, in my
humble experience.

Note: that’s where imperative languages fall short.

3. Functional languages are optimizable.
--------------------------------------------------------

Optimizability (e.g. automatic parallelization, automatic detection and update of indexes, etc.) is a fundamental requirement
for a data processing language.

Data flow analysis is the bread and butter of any query optimizer. 

Data flow analysis is easily doable on a functional language. It’s almost impossible on an imperative language.

Note: again, that’s where imperative languages fall short.
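
A minimal sketch of why side-effect-free code is easy to parallelize, in Python for illustration only (the function names and sample records are hypothetical). A real query optimizer would derive this rewrite automatically from data flow analysis; here the rewrite is done by hand, and the sketch only shows why it is safe.

```python
from concurrent.futures import ThreadPoolExecutor


def normalize(record):
    """Pure function: the result depends only on the argument."""
    return {"name": record["name"].strip().lower(),
            "total": sum(record["amounts"])}


def run_sequential(records):
    return list(map(normalize, records))


def run_parallel(records, workers=4):
    # normalize reads and writes no shared state, so the calls are independent
    # and can run in any order. Executor.map returns results in input order.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(normalize, records))


records = [{"name": " Ada ", "amounts": [1, 2]},
           {"name": "Bob", "amounts": [10]}]
print(run_sequential(records) == run_parallel(records))
```

If normalize updated a shared counter or appended to a global list, proving the two versions equivalent would require exactly the kind of analysis that is so hard on imperative code.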


4. Functional languages are easier to generate automatically by tools
---------------------------------------------------------------------------------------------

In 2015 we shouldn’t be programming as much from scratch, as we should rely more and more on high level tools
to generate the programs for us.

It’s much easier to make a tool generate a functional program rather than an imperative one. 

E.g. try to write a graphical mapping tool, where you drag and drop lines between two schemas and
the data conversion program gets generated automatically. I can generate JSONiq easily. I am not sure I can generate
JavaScript easily.

Note: again, here the imperative programming languages fall short.
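
A rough sketch of the generation step of such a mapping tool, again in Python for illustration only. The mapping format (pairs of target field and dotted source path) and all names are assumptions of the sketch. Each line drawn in the tool becomes one independent sub-expression, and the generated program is a single expression, so the generator never has to decide on statement order or temporary variables.

```python
def path_expr(path):
    """Turn a dotted path such as 'customer.name' into a lookup expression."""
    return "doc" + "".join(f"[{key!r}]" for key in path.split("."))


def generate(lines):
    """lines: (target_field, source_path) pairs, one per line drawn in the tool.

    Returns the source text of a conversion function as one expression.
    """
    fields = ", ".join(f"{target!r}: {path_expr(source)}" for target, source in lines)
    return "lambda doc: {" + fields + "}"


mapping = [("fullName", "customer.name"), ("city", "customer.address.city")]
source = generate(mapping)
print(source)

# eval is used only to show that the generated text is a working program.
convert = eval(source)
print(convert({"customer": {"name": "Ada", "address": {"city": "Paris"}}}))
```

Adding or removing a mapping line changes exactly one sub-expression of the output, which is what makes this kind of generator simple to write.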


================


Now, not everything is rosy in functional programming land. The main disadvantages I have seen are:

1. Yes, they require higher programming skills.

2. Debugging is much more difficult (but that’s true for SQL too).

3. In general, programs are much more terse, hence harder to read, and hence harder to maintain. 


But overall, the advantages will win over the disadvantages in the long term, I think.


In fact, honestly, I don’t see any other solution to complex data processing of large volumes of complex data. 

But I am more than happy to explore other possible options...


Best regards
Dana














