
1 Answer

Best answer

There is a common misconception about the regular expression processor: some people expect a single output column containing everything that the regular expression matched.

But this processor is actually more powerful:

  • First, it lets you create a column containing only part of what was matched. For instance, if you want to extract the link from a simple HTML tag like `<a href="example.com">`, you could write `<a href="([^"]*)">`. The parentheses form a capture group, which designates what you want to extract. In this case, the output column will contain `example.com`.
  • Second, it lets you create several columns at once: simply put several capture groups in the regexp! This also means that, confusingly, if the regexp contains no capture group, no column is created.
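
To make the capture behaviour concrete, here is a minimal sketch using Python's standard `re` module. It is not the processor itself, only an illustration of how capture groups behave; the helper `extract_columns` is a hypothetical name, and treating each group as one output column is an assumption made to mirror the description above.

```python
import re

def extract_columns(pattern, text):
    """Hypothetical helper: return one value per capture group,
    mimicking 'one output column per capture group'."""
    match = re.search(pattern, text)
    if match is None:
        return None  # the pattern did not match at all
    return list(match.groups())

tag = '<a href="example.com">'

# One capture group: a single value, the link only
one_group = extract_columns(r'<a href="([^"]*)">', tag)

# Two capture groups: two values (tag name and link)
two_groups = extract_columns(r'<(\w+) href="([^"]*)">', tag)

# No capture group: the pattern matches, but nothing is extracted
no_group = extract_columns(r'<a href="[^"]*">', tag)

print(one_group)   # ['example.com']
print(two_groups)  # ['a', 'example.com']
print(no_group)    # []
```

The last case is the confusing one: the pattern matches the tag, yet the list of captured values is empty because no parentheses were used.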



©Dataiku 2012-2018 - Privacy Policy