Create an ezParse Rule using simple Regular Expressions

  1. Select Options from the Tools menu and select ezParse from the list displayed on the screen.

  2. From the File groups list control, select Text Based Files.  Select TXT from the File extension list, then select Add from the Rules section of this dialog panel.  When prompted, type ID Based Rule and press OK.

    ezparse_adv_rule1.bmp

  3. Now that you’ve created a Rule called ID Based Rule, double-click it, or highlight the rule and select Edit Rule.

  4. In the resulting dialog, we want to enter a value for the Start Tag and one for the End Tag.

    The Start Tag expression we wish to add is ^[^= ]* = " (see below for an explanation)
    The End Tag should be "
     

    Syntax      Description

    ^           The regular expression syntax for the beginning of a segment.

    [^= ]       Square brackets define a set of characters.  A ^ immediately inside the square brackets negates the set, i.e. read this as any single character excluding an equals sign or a space.

    [^= ]*      The asterisk means any number of times, including zero, i.e. read this as any number of characters excluding an equals sign or a space.

     = "        Matches the literal text: a space, an equals sign, a space, then a double quote.

  5. In the File Preview section, browse to the file IDBasedFiles.txt and press Preview.

  6. The pink colour code indicates a section of text identified as the Start Tag.  The green colour code indicates the localizable text and the yellow colour code indicates the End Tag.

     
    ezparse_adv_rule2.bmp
     

  7. The next step is to set the ID.  The pink colour coded text contains the ID.  To indicate to CATALYST which part of the Start Tag is the ID, click on the Complete Regular Expression.

    ezparse_adv_rule3.bmp

  8. In the resulting dialog, check the option 'Segments Have IDs' and cycle through the numbers to see the effect of changing this value.  The ID can be anywhere in the complete Regular Expression.

    i.e. it could be
    part of the Start Tag,
    part of the Localizable Text,
    part of the End Tag.



    ezparse_adv_rule6.bmp

     

  9. What we need to do is introduce another pair of parentheses surrounding the ID within the Start Tag, i.e. ^([^= ]*) = "

  10. With this new pair in place, setting the group number to 2 highlights the correct section of the Regular Expression as the ID.

    ezparse_adv_rule4.bmp

  11. Press OK to exit the Edit Method Advanced Settings dialog.

  12. Press Preview to ensure that the rule behaves correctly.  The purple colour coding indicates the piece of the segment that has been identified as the ID.

    ezparse_adv_rule5.bmp

     

    Note:

    The preview of the original file is colour coded to help debug ezParse rules. A different colour is used for each element in a matching rule so you can easily spot when rules mismatch content in your file.
     

    Purple coded text represents the ID

    Pink colour indicates the Start Tag

    Green is the localizable text

    Yellow is the End Tag

 

The ezParse rule is now complete and can be used to extract text from any file with a similar format.  Press OK to close the Edit Methods dialog and OK again to save the rule on your machine.
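
To experiment with the expressions outside CATALYST, the rule can be approximated in a few lines of Python.  This is only a sketch under stated assumptions: the sample text is hypothetical (the contents of IDBasedFiles.txt are not shown here), and the way the complete expression is assembled, with the Start Tag, localizable text and End Tag each wrapped in their own group, is a guess that is consistent with the ID being group 2.  It is not a description of how CATALYST builds the expression internally.

```python
import re

# Start Tag and End Tag as entered in the rule (the step 9 version,
# with the extra parentheses around the ID).
START_TAG = r'^([^= ]*) = "'
END_TAG = r'"'

# Assumption: each part of the rule is wrapped in its own group, so the
# groups are 1 = Start Tag, 2 = ID, 3 = localizable text, 4 = End Tag.
COMPLETE = re.compile(f"({START_TAG})(.*?)({END_TAG})", re.MULTILINE)

# Hypothetical sample content in an ID = "text" format.
SAMPLE = 'IDS_HELLO = "Hello world"\nIDS_BYE = "Goodbye"\n'


def extract_segments(text):
    """Return (id, localizable_text) pairs for every matching line."""
    return [(m.group(2), m.group(3)) for m in COMPLETE.finditer(text)]


for seg_id, seg_text in extract_segments(SAMPLE):
    print(f"{seg_id}: {seg_text}")
```

Changing m.group(2) to another group number has much the same effect as cycling through the numbers in the 'Segments Have IDs' option: group 1 would return the whole Start Tag, including the space, equals sign and double quote, rather than just the ID.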