Pages

Thursday 2 August 2012

Some basic string manipulation in OBIEE.

Manipulating strings using SUBSTR

To extract parts of a string of characters like prefixes or file extensions (as an example) we involve a few of the base string functions available in OBIEE.

So what basic features do we need to know about strings, or what basic functionality do we need to manipulate strings.

SUBSTRING

To extract a substring from a string we need to be able to determine several things about the string.
1. Its overall length
2. If a character or sequence of characters is in the string.
3. the start position of a sequence of one or more characters within the string.

So how long is a string?

We can use the LENGTH function to return the number of characters in the string.

We have the LOCATE function to determine the position of a particular string within the string.

Used together they will allow us to extract the second word from the string 'one two three'.

Let's do a worked example

Start a new Analysis and select any column. We will change the contents as we go. Select Edit formula

and edit the column formula to contain 'one two three' (the single quotes are important as they tell OBIEE this is a string).










Add another column and this time we will edit the formula to show the length of the first column. Select the dropdown on the new column and along the bottom of the Column formula panel you will find a column button with a down chevron, click this and select 'one two three'. Highlight 'one two three' then using the f(...) button at the bottom left we will expand the string functions and select LENGTH.

LENGTH('one two three')

Similarly we will now add columns for finding the first occurence of space ' '.

LOCATE(' ', 'one two three')

and another for the next space

LOCATE(' ', 'one two three', LOCATE(' ', 'one two three')+1)
and one to extract the second word, (remember the word starts one character after the first Locate).

SUBSTRING('one two three' FROM LOCATE(' ', 'one two three')+1 FOR (LOCATE(' ', 'one two three', LOCATE(' ', 'one two three')+1)-LOCATE(' ', 'one two three')))

The SUBSTR has the following syntax SUBSTR(string FROM start FOR number). So we can fill in the blanks
  • string is obviously 'one two three'
  • start is LOCATE(' ', 'one two three')+1
  • and number is the length of the word which we can calculate as (the second occurence of space) minus (the first occurence of space + 1)

and finally a test of the length of the second word, just make sure.

LENGTH(SUBSTRING('one two three' FROM LOCATE(' ', 'one two three')+1 FOR (LOCATE(' ', 'one two three', LOCATE(' ', 'one two three')+1)-LOCATE(' ', 'one two three'))))





1 comment:

  1. Hi. Have a field with name date (i.e. Mark Jones, Mark, Mark A Jones, Jones Mark, etc.). Spaces separate names in field. Want to have a separate field for the first occurrence (i.e. Mark) and a separate field for the second occurrence (i.e. Jones), and even the third occurrence if middle name or initial is present (i.e. A). Some of the field values have a full name and some have just one or two values, and this is what makes the extraction difficult because of the spaces. When I try examples in above posts, I get the "...The length operand 0 of the Substring operator must be greater than zero.". ANy ideas for a solution to this?

    ReplyDelete