cfRegeX


Split

The split action converts a string into an array, treating regex matches as delimiters. In effect, this is the opposite of Match: Match creates an array of the matches, whilst Split populates the array with the text that does not match the regex.

You can use start to specify a character position before which no splitting takes place, and use limit to specify the maximum number of times the text should be split.

If you have complicated splitting conditions, you can pass in a callback function which will be called each time a match is found, and the split will only occur if the function returns true.
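The way start, limit and the callback interact can be modelled in a few lines. The following is a Python sketch of the behaviour described above, not cfRegex's own implementation: the name regex_split is hypothetical, and the callback here receives only the matched text, which is a simplification of the real callback signature (see the Callbacks section).

```python
import re

def regex_split(pattern, text, start=1, limit=0, callback=None):
    """Hypothetical model of the split action; not the cfRegex source."""
    parts = []
    piece_start = 0   # index where the current piece begins
    splits = 0        # number of splits performed so far
    # start is 1-based; matches that begin before it are never considered.
    for m in re.compile(pattern).finditer(text, start - 1):
        if limit and splits >= limit:
            break     # limit reached; the rest of the text stays in one piece
        if callback is not None and not callback(m.group()):
            continue  # rejected match: no split, and not counted towards limit
        parts.append(text[piece_start:m.start()])
        piece_start = m.end()
        splits += 1
    parts.append(text[piece_start:])
    return parts

text = "The quick fox jumps over the lazy brown dog."

def check_word(match):
    # Only split on words of four or more characters.
    return len(match) >= 4

print(regex_split(r"\s+", text, 5, 3))
# ['The quick', 'fox', 'jumps', 'over the lazy brown dog.']
print(regex_split(r"\w+", text, callback=check_word, limit=3))
# ['The ', ' fox ', ' ', ' the lazy brown dog.']
```

Note that the text before start is not discarded: it stays attached to the first piece, which is why the first element above is 'The quick'.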

Object

Arguments

Name | Type | Required | Default | Notes
Text | String | yes | n/a | The text which is to be split by the regex.
Start | Char Position | no | 1 | Position at which to start splitting (1 is the first character).
Limit | Integer | no | 0 | Number of times to split before stopping (0 is unlimited).
Callback | Function | no | none | A function called each time a match is made. If the function returns false, the split does not occur (and does not count towards limit). See the Callbacks section for full details on the function signature and how to use this feature.
CallbackData | Struct | no | none | A structure which is passed into the callback function.

Usage Examples

<cfset Input = "The quick fox jumps over the lazy brown dog." />
<cfset WhitespaceRx = new Regex( '\s+' ) />
<cfset WordRx = new Regex( '\w+' ) />


<cfdump var=#WhitespaceRx.split( Input )# />
Outputs: ['The','quick','fox','jumps','over','the','lazy','brown','dog.']

<cfdump var=#WhitespaceRx.split( Input , 5 )# />
Outputs: ['The quick','fox','jumps','over','the','lazy','brown','dog.']

<cfdump var=#WhitespaceRx.split( Input , 5 , 3 )# />
Outputs: ['The quick','fox','jumps','over the lazy brown dog.']

<cfdump var=#WordRx.split( text=Input , callback=checkWord )# />

<cffunction name="checkWord" returntype="Boolean" output="false">
    <cfargument name="Match" type="String" />
    <cfreturn Len(Arguments.Match) GTE 4 />
</cffunction>
Outputs: ['The ',' fox ',' ',' the ',' ',' dog.']

<cfdump var=#WordRx.split( text=Input , callback=checkWord , limit=3 )# />

<cffunction name="checkWord" returntype="Boolean" output="false">
    <cfargument name="Match" type="String" />
    <cfreturn Len(Arguments.Match) GTE 4 />
</cffunction>
Outputs: ['The ',' fox ',' ',' the lazy brown dog.']

Tag

Attributes

Name | Type | Required | Default | Notes
Variable | VarName | no | "cfregex" | The variable which the result is assigned to.
Text | String | yes | n/a | The text which is to be split by the regex.
Start | Char Position | no | 1 | Position at which to start splitting (1 is the first character).
Limit | Integer | no | 0 | Number of times to split before stopping (0 is unlimited).
Callback | Function | no | none | A function called each time a match is made. If the function returns false, the split does not occur (and does not count towards limit). See the Callbacks section for full details on the function signature and how to use this feature.
CallbackData | Struct | no | none | A structure which is passed into the callback function.
Modes | StringList | no | none | List of regex modes to apply to the pattern.

Usage Examples

<cfset Input = "The quick fox jumps over the lazy brown dog." />


<cfregex split variable="Output" text=#Input# >
    \s+
</cfregex>
<cfdump var=#Output# />
Outputs: ['The','quick','fox','jumps','over','the','lazy','brown','dog.']

<cfregex split variable="Output" text=#Input# start=5 >
    \s+
</cfregex>
<cfdump var=#Output# />
Outputs: ['The quick','fox','jumps','over','the','lazy','brown','dog.']

<cfregex split variable="Output" text=#Input# start=5 limit=3 >
    \s+
</cfregex>
<cfdump var=#Output# />
Outputs: ['The quick','fox','jumps','over the lazy brown dog.']

<cfregex split variable="Output" text=#Input# callback=#checkWord# >
    \w+
</cfregex>
<cfdump var=#Output# />

<cffunction name="checkWord" returntype="Boolean" output="false">
    <cfargument name="Match" type="String" />
    <cfreturn Len(Arguments.Match) GTE 4 />
</cffunction>
Outputs: ['The ',' fox ',' ',' the ',' ',' dog.']

<cfregex split variable="Output" text=#Input# callback=#checkWord# limit=3 >
    \w+
</cfregex>
<cfdump var=#Output# />

<cffunction name="checkWord" returntype="Boolean" output="false">
    <cfargument name="Match" type="String" />
    <cfreturn Len(Arguments.Match) GTE 4 />
</cffunction>
Outputs: ['The ',' fox ',' ',' the lazy brown dog.']

Function

Arguments

Name | Type | Required | Default | Notes
Pattern | RegexString | yes | n/a | The regex pattern to compile into a Regex Object.
Text | String | yes | n/a | The text which is to be split by the regex.
Start | Char Position | no | 1 | Position at which to start splitting (1 is the first character).
Limit | Integer | no | 0 | Number of times to split before stopping (0 is unlimited).
Callback | Function | no | none | A function called each time a match is made. If the function returns false, the split does not occur (and does not count towards limit). See the Callbacks section for full details on the function signature and how to use this feature.
CallbackData | Struct | no | none | A structure which is passed into the callback function.
Modes | StringList | no | none | List of regex modes to apply to the pattern.

Usage Examples

<cfset Input = "The quick fox jumps over the lazy brown dog." />

<cfdump var=#RegexSplit( '\s+' , Input )# />
Outputs: ['The','quick','fox','jumps','over','the','lazy','brown','dog.']

<cfdump var=#RegexSplit( '\s+' , Input , 5 )# />
Outputs: ['The quick','fox','jumps','over','the','lazy','brown','dog.']

<cfdump var=#RegexSplit( '\s+' , Input , 5 , 3 )# />
Outputs: ['The quick','fox','jumps','over the lazy brown dog.']

<cfdump var=#RegexSplit( pattern='\w+' , text=Input , callback=checkWord )# />

<cffunction name="checkWord" returntype="Boolean" output="false">
    <cfargument name="Match" type="String" />
    <cfreturn Len(Arguments.Match) GTE 4 />
</cffunction>
Outputs: ['The ',' fox ',' ',' the ',' ',' dog.']

<cfdump var=#RegexSplit( pattern='\w+' , text=Input , callback=checkWord , limit=3 )# />

<cffunction name="checkWord" returntype="Boolean" output="false">
    <cfargument name="Match" type="String" />
    <cfreturn Len(Arguments.Match) GTE 4 />
</cffunction>
Outputs: ['The ',' fox ',' ',' the lazy brown dog.']

Practical Examples

Example 1

Split on a comma that has not been escaped with a backslash:

<cfregex
    action   = "split"
    variable = "OutputArray"
    text     = #InputString#
    >
    ## lookbehind for either...
    (?<=
        ## not a backslash
        [^\\]
    |
        ## or, an EVEN number of backslashes (up to 20 here)
        ## (an odd number would mean the comma itself is escaped)
        (?<!\\)(?:\\\\){1,10}
    )
    ## followed by a comma
    ,
</cfregex>
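The rule this pattern encodes (a comma is a delimiter only when it is preceded by an even number of backslashes) can also be checked with a plain scan. The sketch below is in Python and is an illustration only: the function name split_unescaped_commas is hypothetical, and because Python's re module accepts only fixed-width lookbehind, it counts backslashes by hand rather than using the pattern above. Unlike that pattern, it also splits on a comma at the very start of the text and places no upper bound on the number of backslashes.

```python
def split_unescaped_commas(text):
    """Split on commas preceded by an even number (including zero) of backslashes."""
    parts = []
    piece_start = 0   # index where the current piece begins
    backslashes = 0   # length of the backslash run just before the current character
    for i, ch in enumerate(text):
        if ch == "\\":
            backslashes += 1
            continue
        if ch == "," and backslashes % 2 == 0:
            # An even run means every backslash is itself escaped,
            # so the comma is a genuine delimiter.
            parts.append(text[piece_start:i])
            piece_start = i + 1
        backslashes = 0
    parts.append(text[piece_start:])
    return parts

print(split_unescaped_commas(r"a,b\,c,d"))   # ['a', 'b\\,c', 'd']
print(split_unescaped_commas(r"a\\,b"))      # ['a\\\\', 'b']
```

As with the regex, the backslashes are left in the resulting pieces; unescaping them is a separate step.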