
1 Answer

Best answer

There is a common misconception about the regular expression processor: some people expect a single output column containing everything matched by the regular expression.

But this processor is actually more powerful:

  • First, it lets you create a column containing only part of what was matched. For instance, to extract the link from a simple HTML tag like `<a href="example.com">`, you could write `<a href="([^"]*)">`. The parentheses form a capture group and designate what you want to extract. In this case, the output column will contain `example.com`.
  • Second, it lets you create several columns at once: simply put several capture groups in the regexp. Confusingly, this also means that if the regexp contains no capture group, no column is created.
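
To make the "one column per capture group" behavior concrete, here is a minimal sketch using Python's standard `re` module. It only imitates the idea and is not the processor's actual implementation; the helper name `extract_columns` and the second pattern with two groups are assumptions made for illustration.

```python
import re

def extract_columns(pattern, value):
    """Return one value per capture group, or an empty list if none is captured.

    Hypothetical helper: it mimics the 'one output column per capture group'
    behavior described above, using Python's re module.
    """
    match = re.search(pattern, value)
    if match is None:
        return []
    # groups() holds only the captured parts, never the whole match
    return list(match.groups())

html = '<a href="example.com">'

# One capture group: a single output value
one = extract_columns(r'<a href="([^"]*)">', html)

# Two capture groups: two output values (tag name and link)
two = extract_columns(r'<(\w+) href="([^"]*)">', html)

# No capture group: the pattern matches, but nothing is extracted
none = extract_columns(r'<a href="[^"]*">', html)

print(one, two, none)
```

If you do want the whole match as a column, wrapping the entire pattern in parentheses should work, since the whole match then becomes a capture group of its own.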



©Dataiku 2012-2018 - Privacy Policy