GROUP Part 3 – Nested GROUP and LOOP

In this final post on GROUP, we will review nested GROUP commands and LOOP. The prior posts here and here provided a basic overview of GROUP and then discussed how to implement GROUP IF.

Example 5 – Nested GROUP

Nested GROUP commands become very handy once you understand how to use GROUP IF. By nesting GROUP IF commands, you can control the flow of the script’s executing logic more efficiently and transparently than would otherwise be possible. I use GROUP IF exclusively as logic controllers and treat them as IF blocks in my code (a feature I wish very much was available outside of GROUP).

To better understand my previous statement, let’s look at the script from example 3:

005 ex 5_01.png

This script is functional – we know this because we walked through it in detail – but there is something about this script that bothers me at a very deep level, and that is code duplication. In this GROUP block we typed out the exact same EXTRACT command and v_Cat assignment twice. Now, if we need to make updates (say, add another field to EXTRACT), we have to do it in 2 different places. If we forget to update one of the commands, the output table will be garbled and useless. Any time we duplicate code we are creating potential maintenance nightmares in the future. Not to mention that it requires mindless effort. I hate repetitive tasks – if I didn’t, I would not spend so much time learning how to automate my job into oblivion so I can focus on new and exciting things.

So, there has to be a better way, and that way is to nest GROUP. By nesting GROUP, we can separately identify when the category changes and reset our running total in its own little block of code. Take a look at the following script:

005 ex 5_02.png

Notice that we have removed the IF statement from the initial GROUP command – that command will now run in its entirety on each record. We have moved the IF command to a nested GROUP block, which will be the first thing to execute on each record. Now we can get rid of the duplicated EXTRACT command and v_Cat assignment. Let’s see how this command executes. We start with the primed variables.

005 ex 5_03

Now the main GROUP block begins executing.

005 ex 5_04.png

The first command is the nested GROUP IF command, which determines if v_Cat is a different value from the current record’s Category field. In this first run it is, so IF evaluates to True and executes the code in the nested GROUP block.

005 ex 5_05.png

The nested GROUP does only 1 thing – it resets the running total to 0.

005 ex 5_06

The nested GROUP then exits and the remainder of the main GROUP executes by adding the current record’s value to the running total, extracting that value, and resetting the v_Cat variable.

005 ex 5_07

We have reached the end of the main GROUP block and there are more records to process, so GROUP begins again by immediately evaluating the nested GROUP IF command.

005 ex 5_08.png

GROUP IF evaluates False, so its code is skipped and the remainder of the main GROUP executes.

005 ex 5_09

The next record is iterated and GROUP IF is evaluated…

005 ex 5_10

…which returns False. The remainder of the main GROUP block is executed.

005 ex 5_11

ACL iterates to the next record. On this iteration we have come to a new category, so GROUP IF evaluates True and executes its code…

005 ex 5_12.png

…which resets the running total to 0.

005 ex 5_13.png

The remainder of the main GROUP command finishes executing.

005 ex 5_14.png

The remaining records execute as we have already seen. When these nested GROUP commands have finished executing, we will be left with a table that is identical to the end result in example 3.

Example 6 – LOOP

Now we have come to the final example. LOOP is a command that can only be used inside GROUP and which allows us to execute the same set of commands multiple times for each record. Some of you may have familiarity with other scripting or programming languages which have looping commands. The idea is the same, but the usage of ACL’s LOOP command is restricted to a very specific scope – it can only be used in the context of a table that is being processed by a GROUP command. You cannot use this command to create looping code elsewhere in your scripts. To do that, you need to use DO…WHILE. Do not spend your time, like I did when first learning the command, trying to use LOOP to replicate the looping commands found in other languages.

In this example we are going to extract each individual tag contained in our sample transaction table into its own record. The script which will accomplish this, and our sample data, is below.

005 ex 6_01.png

The GROUP command starts by calculating how many tags are in the current record’s tag field. In our sample table, the tag field can contain multiple tags which are separated by commas. The OCCURS function will detect how many commas are in the field. We add 1 to this result because there will always be one more tag than there are commas.

005 ex 6_02.png

Next, we initialize a counter which we will use to loop through the number of tags.

005 ex 6_03.png

Now LOOP begins, and it begins by determining whether our counter is at a valid tag. We have 1 tag, and our counter is set to tag 1, so LOOP can execute.

005 ex 6_04.png

LOOP starts by calculating the value of the tag in the first position using the SPLIT function.

005 ex 6_05.png

The calculated tag is then extracted along with other relevant data from the record.

005 ex 6_06.png

The final command increments the counter by a value of 1.  This is a very crucial line of code – without it LOOP becomes an infinite loop.  If you are like me, you will spend so much time focusing on the rest of the GROUP/LOOP logic that you will occasionally forget to add this to the end of the block.  And when you finally get around to cancelling the script because it is taking too long, you will have a table with hundreds of thousands (or millions, if you got up to get a cup of coffee) of records all containing the same data.  If you leave the script unattended, it will create records until your hard drive is full (assuming you have disabled ACL’s loop limit protections, which I do).  It is probably a good idea to stay at your computer when testing LOOP.

005 ex 6_07.png

The end of the LOOP is reached, which causes LOOP to start again by evaluating its WHILE condition.

005 ex 6_08.png

Now, the current tag as represented by the counter is invalid – according to v_Tags there is only 1 tag, not 2. The WHILE condition returns False and LOOP stops executing. This takes us to the end of the GROUP command, which iterates the next record.

Side note: Unlike other languages which can allow the loop condition to be evaluated either before or after the loop code, the LOOP command always evaluates the WHILE condition before the LOOP code executes.

005 ex 6_09.png

Now that we are back at the beginning of GROUP, the number of tags is recalculated and the counter is re-initialized to 1.

005 ex 6_10.png

LOOP executes again. This record only has 1 tag, so its execution is identical to the first record.

005 ex 6_11

We are finished processing this record, so it is time to move to the next record.

005 ex 6_12.png

This record is also processed in a similar manner because it only has one tag. We will skip the details and move along.

005 ex 6_13.png

Now we have iterated to a record with multiple tags.

005 ex 6_14.png

This record begins, as with all of the prior records, by calculating the number of tags and re-initializing the counter to 1.

005 ex 6_15.png

The LOOP block executes by calculating the first tag, which is ‘A’, extracting this tag and other record information to the result table, and incrementing the counter.

005 ex 6_16

LOOP begins again by evaluating the WHILE condition, which is still true because our counter has not exceeded the number of tags in the record.

005 ex 6_17.png

The code in the LOOP block now executes by calculating the second tag (‘B’), extracting it to the result table, and incrementing the counter.

005 ex 6_18.png

LOOP begins again by evaluating the WHILE condition, which is still true.

005 ex 6_19.png

The LOOP code now executes for the third tag (‘E’).

005 ex 6_20

LOOP is finished for this record – the counter has finally exceeded the number of tags. GROUP begins again on the next record.

005 ex 6_21.png

The combination of GROUP and LOOP continue executing in this manner for the remaining 2 records. By the time they have finished processing this table the result looks like this:

005 ex 6_22.png

LOOP can be a powerful tool in your analytical toolbox to perform tasks which are otherwise difficult to perform. Some examples of how I have used LOOP include:

  • Extracting regular expression pattern matches out of fields of free-form text.
  • Extracting comments from a table imported from ACL GRC.
  • Associating formal names and nicknames to each other from a file containing delimited name associations (if you were at ACL Connections 2015 and attended the session on regular expressions, you may remember me talking about this briefly).
  • I have a fairly complex normalization script that, among other things, will optionally sort the individual characters in a field by alphabetical order. I do this using nested LOOP statements to iterate through each character of the source string and plop it into the proper position of the output string. The result is pretty cool, and after innumerable tests and failures, I was quite excited when I finally got it to work.

This brings our discussion of GROUP to a close. It is my sincere hope that it has provided clarity about the command and gives you the knowledge and confidence to use it in your own scripts. As a parting gift, you can download a non-Unicode project with the sample data and example scripts here.


Leave a Reply

Fill in your details below or click an icon to log in: Logo

You are commenting using your account. Log Out /  Change )

Google+ photo

You are commenting using your Google+ account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )


Connecting to %s