Figure 1. Patched SetProperty function
Figure 3. Expression properties

We will focus only on the following properties, as they are the ones relevant to this analysis:
- input - the source string
- lastMatch - the most recently matched string
- $1-$9 - the sub-match (capture group) strings
- 0x24: pInputString (the source string)
- 0x28: startIndex (the index in pInputString at which the lastMatch string begins)
- 0x2c: length (the length of the lastMatch string)
- 0x30: isNeedUpdate (whether the related fields still need to be updated)
- 0x40: pUnmatchString
- 0x44: pUnSearchString
- 0x48: pLastMatchString (the address of the matching string)
- 0x4c: $1 (the submatch 1 string)
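The offsets listed above can be turned into a small decoder for a raw memory dump. This is only a sketch: the function name `decodeRegExpState` is hypothetical, and it assumes a 32-bit little-endian process in which every field is read as a 4-byte value (the flag at 0x30 may well be narrower in the real object).

```javascript
// Offsets taken from the list above; everything else here is illustrative.
const FIELD_OFFSETS = {
  pInputString: 0x24,
  startIndex: 0x28,
  length: 0x2c,
  isNeedUpdate: 0x30,
  pUnmatchString: 0x40,
  pUnSearchString: 0x44,
  pLastMatchString: 0x48,
  $1: 0x4c,
};

// Hypothetical helper: read each field as a 32-bit little-endian value.
function decodeRegExpState(buffer) {
  const view = new DataView(buffer);
  const fields = {};
  for (const [name, offset] of Object.entries(FIELD_OFFSETS)) {
    fields[name] = view.getUint32(offset, true); // true = little-endian
  }
  return fields;
}

// Build a fake dump with made-up values to exercise the decoder.
const dump = new ArrayBuffer(0x50);
const writer = new DataView(dump);
writer.setUint32(0x24, 0x0badf00d, true); // made-up pointer value
writer.setUint32(0x28, 0x14, true);       // startIndex
writer.setUint32(0x2c, 0x06, true);       // length
writer.setUint32(0x30, 1, true);          // isNeedUpdate

const decoded = decodeRegExpState(dump);
console.log(decoded.startIndex, decoded.length, decoded.isNeedUpdate);
```

Fields that were never written, such as `$1`, decode as zero because a fresh `ArrayBuffer` is zero-filled.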
- 0x00: pVtable
- 0x08: length (the string length)
- 0x0c: pString (pointer to the string array's starting address)

Demonstration
Consider the following code. Figure 4 shows the contents of memory after re.exec has finished running:

var src = "Please send mail to firstname.lastname@example.org and email@example.com. Thanks!";
var pattern = "(george)";
var re = new RegExp( pattern );
re.exec( src );
alert(RegExp.$1);
Figure 4. Memory contents after re.exec command

pInputString points to the string "Please send mail...". lastMatch.startIndex is 0x14, and lastMatch.length is 0x06. isNeedUpdate is True. The other fields still have their initial values; for example, $1 is set to the null string. Only the fields noted above were modified. After alert(RegExp.$1) has run, the contents of the memory are as follows:
Figure 5. Memory contents after alert(RegExp.$1)

Referencing RegExp.$1 calls GetProperty, which proceeds as follows:
- It checks whether isNeedUpdate is false. If it is, the fields are already up to date and the function returns. Otherwise, it continues to the next step to update the fields.
- EnsureValues first uses UnifiedRegex::RegexPattern::GetGroup to get the $1-related information. It returns startIndex and length: startIndex is the offset at which $1 begins within pInputString, and length is the length of $1.
- It calls Js::SubString::New to create a Js::SingleCharString. The parameters are pInputString, startIndex, and length. The return value of this function is $1_address.
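The lazy update described in these steps can be modeled in a few lines. This is a simplified sketch, not engine code: `recordMatch`, `getGroup`, `ensureValues`, and `getProperty` are hypothetical stand-ins, the sample input string is made up so that the match starts at index 0x14, and clearing isNeedUpdate after the update is an assumption inferred from the first step.

```javascript
// What exec is described as storing: the input, the match position,
// and a flag saying the derived fields have not been computed yet.
function recordMatch(inputString, startIndex, length) {
  return {
    pInputString: inputString,
    startIndex: startIndex,
    length: length,
    isNeedUpdate: true,
    $1: "", // still the initial (null string) value
  };
}

// Stand-in for UnifiedRegex::RegexPattern::GetGroup. With the pattern
// "(george)", group 1 covers the same range as the whole match.
function getGroup(state) {
  return { startIndex: state.startIndex, length: state.length };
}

function ensureValues(state) {
  if (!state.isNeedUpdate) {
    return; // already up to date
  }
  const group = getGroup(state);
  // Stand-in for Js::SubString::New(pInputString, startIndex, length).
  state.$1 = state.pInputString.substring(
    group.startIndex,
    group.startIndex + group.length
  );
  state.isNeedUpdate = false; // assumption: the flag is cleared here
}

function getProperty(state, name) {
  ensureValues(state);
  return state[name];
}

// Illustrative input: "george" begins at index 0x14 and is 0x06 long.
const state = recordMatch("Please send mail to george. Thanks!", 0x14, 0x06);
const firstGroup = getProperty(state, "$1");
console.log(firstGroup);
```

Note that the sub-match is computed from whatever pInputString holds at the time of the read, not at the time of the match.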
var src = "Please send mail to firstname.lastname@example.org and email@example.com. Thanks!";
var pattern = "(george)";
var re = new RegExp( pattern );
re.exec( src );
RegExp.input = "123456";
alert(RegExp.$1);
- When re.exec finishes, the function stores only the input string (we call this right_inputString) and the index and length of the lastMatch string.
- When we set RegExp.input, RegExp modifies only the pInputString field (we call the new value error_inputString) and leaves the other fields unchanged.
- When we read the RegExp.$1 property, RegExp uses the pInputString field (now error_inputString) to compute the other fields.
- If the length of error_inputString is smaller than $1.startIndex + $1.length, referencing RegExp.$1 reads past the end of error_inputString, which is an out-of-bounds read.
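The arithmetic behind the out-of-bounds read can be illustrated with a toy model in which a flat byte array stands in for process memory. Everything here is hypothetical (the `heap` array, the helper names, and the "SECRET" bytes placed after the short string); it only shows why a stale startIndex and length applied to a shorter string reach memory that does not belong to it. The `isOutOfBounds` check is one possible guard, not necessarily what the vendor patch does.

```javascript
// Toy model only: a flat byte array stands in for process memory.
const heap = new Uint8Array(64);

// Copy a string into the toy heap and return its address and length.
function writeString(address, text) {
  for (let i = 0; i < text.length; i++) {
    heap[address + i] = text.charCodeAt(i);
  }
  return { address: address, length: text.length };
}

// Read `length` bytes starting `startIndex` bytes into the string,
// without checking the range against the string's own length.
function uncheckedSubString(str, startIndex, length) {
  let out = "";
  for (let i = 0; i < length; i++) {
    out += String.fromCharCode(heap[str.address + startIndex + i]);
  }
  return out;
}

// True when the requested range extends past the end of the string.
function isOutOfBounds(str, startIndex, length) {
  return startIndex + length > str.length;
}

// error_inputString is only 6 bytes long...
const errorInputString = writeString(0, "123456");
// ...and unrelated data happens to live further along in memory.
writeString(0x14, "SECRET");

// Stale values left over from the earlier match.
const matchStart = 0x14;
const matchLength = 0x06;

const outOfBounds = isOutOfBounds(errorInputString, matchStart, matchLength);
const leaked = uncheckedSubString(errorInputString, matchStart, matchLength);
console.log(outOfBounds, leaked);
```

Here 0x14 + 0x06 is 26, which exceeds the 6-byte string, so the unchecked read returns bytes that belong to the neighboring data.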