Replacement Stack

The "Replacement Stack" is a new feature in Proxomitron Naoko-4. Normally Proxomitron supports ten matching variables, \0 to \9. For most things this is fine, but occasionally you may find you need something more.

Take, for example, using the "+" or "++" expressions to match a repeating run of items. What if you want to store each item found in a different variable? For instance, look at this expression, which matches each section of a URL path...

http://(*/)+*.html

The (*/)+ bit will match each section of the URL, but how can we capture each of these sections into a different variable? This is where the replacement stack comes in. It uses the special character "\#" which, like "\0" through "\9", stores a matched value. However, each time it's called it stores the matched value in a "stack" which can hold up to 100 items. In the replacement text, "\#" then "pops" an item off that stack in a first-in, first-out manner. Using this we could write the above match like so...

http://(\#/)+\#.html

Then use a replacement text like...

"\# \# \# \# \# \# \#"

Which would convert this URL...

http://this/is/a/test/of/the/stack.html

Into this...

"this is a test of the stack"

Each time the (...)+ run loops it pushes a new value onto the stack, then whatever's left over gets matched by the final "\#". Also, like the other positional variables, the stack variable can be placed directly after a set of parentheses to capture the expression within. For instance...

http://(*/)\#+\#

would produce...

"this/ is/ a/ test/ of/ the/ stack.html"

The replacement text also recognizes another special escape "\@". This just dumps the entire contents of the stack in order (like using \#\#\#\#\#\#\#\#...). In fact, this is probably what you'll use most of the time.
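To make the push and pop order concrete, here is a small Python sketch that imitates the behavior described above. It is only a model, not how Proxomitron is implemented: the class name ReplacementStack, the function match_url, and the handling of an empty or full stack are all assumptions made for illustration. It mimics the match http://(*/)\#+\# with a regular expression.

```python
import re
from collections import deque

class ReplacementStack:
    # Toy model of the \# stack: values are pushed while matching and
    # popped first-in, first-out in the replacement text.
    LIMIT = 100  # capacity stated in the text

    def __init__(self):
        self.items = deque()

    def push(self, value):
        # Assumption: pushes beyond the limit are silently ignored.
        if len(self.items) < self.LIMIT:
            self.items.append(value)

    def pop(self):
        # Models one \# in the replacement text.
        # Assumption: popping an empty stack yields an empty string.
        return self.items.popleft() if self.items else ""

    def dump(self):
        # Models \@: every remaining item, in the order it was pushed.
        out = "".join(self.items)
        self.items.clear()
        return out

def match_url(url):
    # Rough stand-in for the match  http://(*/)\#+\#
    stack = ReplacementStack()
    m = re.fullmatch(r"http://((?:[^/]*/)+)(.*)", url)
    if m is None:
        return None
    for segment in re.findall(r"[^/]*/", m.group(1)):
        stack.push(segment)   # one push per pass through the (...)+ loop
    stack.push(m.group(2))    # the final \# takes whatever is left over
    return stack

stack = match_url("http://this/is/a/test/of/the/stack.html")
print(" ".join(stack.pop() for _ in range(7)))
```

Calling dump() on a freshly filled stack instead of popping seven times would return the same items joined with nothing in between, which is what "\@" does.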

Some typical uses...

Another way to think of \# is that it's exactly like \1, \2, \3, etc., except that each time it's called the value is added to the end of whatever's already there rather than replacing it. Here are some examples...

Remove attributes from a tag:

Bounds:	<Sometag\s*>
Matching:	(\#(attr1|attr2|attr3)=$AV(*))+ \#
Replace:	\@

The first "\#" catches the text up to the first attribute to be removed (if any).
Further iterations of the loop catch the text between that attribute and any others to be removed.
The final "\#" catches what's left over after the last attribute, or the entire tag if no attribute was matched.
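The three steps above can be imitated in Python to see what ends up on the stack. This is a rough sketch under stated assumptions, not Proxomitron code: the function name strip_attrs is hypothetical, and a simplified regular expression stands in for $AV(*), accepting only double-quoted or unquoted values.

```python
import re

# Simplified stand-in for (attr1|attr2|attr3)=$AV(*).
# Assumption: values are either double-quoted or unquoted.
REMOVE = re.compile(r'\b(?:attr1|attr2|attr3)=(?:"[^"]*"|[^\s>]*)')

def strip_attrs(tag):
    stack = []   # plays the role of the \# stack
    pos = 0
    for m in REMOVE.finditer(tag):
        stack.append(tag[pos:m.start()])  # \# : text before this unwanted attribute
        pos = m.end()                     # the attribute itself is never pushed
    stack.append(tag[pos:])               # final \# : whatever is left over
    return "".join(stack)                 # \@ : dump the stack in order

print(strip_attrs('<Sometag attr1="x" keep="y" attr3=z>'))
```

Only the text between the unwanted attributes is ever pushed, so dumping the stack rebuilds the tag without them. Unlike the real filter, this sketch does not tidy up the whitespace that surrounded a removed attribute.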

Keep only selected attributes (and toss everything else):

Bounds:	<Sometag\s*>
Matching:	(((attr1|attr2|attr3)=$AV() )\#)+ *
Replace:	<Sometag \@>

This is the opposite of the filter above, but very similar. It throws away everything except the attributes we want to keep. Here there's only one "\#", which captures each attribute together with its value.
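The intent of this filter, keeping only the listed attributes and rebuilding the tag around them, can be modeled in a few lines of Python. This is a sketch of the described result rather than a translation of the matching expression: the function name keep_attrs is hypothetical, and the regular expression is a simplified stand-in for the attribute match that assumes double-quoted or unquoted values.

```python
import re

# Simplified stand-in for (attr1|attr2|attr3)= followed by a value.
# Assumption: values are either double-quoted or unquoted.
KEEP = re.compile(r'\b(?:attr1|attr2|attr3)=(?:"[^"]*"|[^\s>]*)')

def keep_attrs(tag):
    # Each wanted attribute, plus a trailing space, is pushed onto the stack;
    # everything else in the tag is simply never captured.
    stack = [m.group(0) + " " for m in KEEP.finditer(tag)]
    # Replacement text  <Sometag \@>  wraps the dumped stack in a fresh tag.
    return "<Sometag " + "".join(stack) + ">"

print(keep_attrs('<Sometag junk=1 attr2="b" other=2 attr1=a>'))
```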

Replace attribute values with something else:

Bounds:	<Sometag\s*>
Matching:	(\#((attr1=)\#foo$SET(\#=bar)| (attr2=)\#black$SET(\#=white)| (attr3=)\#one$SET(\#=zero)))+ \#
Replace:	\@

This is a bit more complex. Notice it uses the $SET(\#=...) command along with the normal match. This can be used to do some interesting things.

The example above first looks for one of the attributes it's trying to match. When it finds one, it pushes its name onto the stack and then tries to match the value. If it finds the value it's looking for, it uses $SET to push a new value instead. In addition, it captures and preserves anything else that's not matched. You could even replace the separate attributes with a list call like so...

(\#$LST(AttributeReplace))+ \#

With list items like...

#
# Sample attribute replacement list
#
(attr1=)\#foo$SET(\#=bar)
(attr2=)\#black$SET(\#=white)
(attr3=)\#one$SET(\#=zero)

or even...

#
# Sample attribute replacement list two
#
attr1=foo $SET(\#=attr1=bar )
attr2=black $SET(\#=attr2=white )
attr3=one $SET(\#=attr3=zero )

to selectively replace both the attribute and its value with something else. (Actually this is also better because the items are "hashable" in the list and can match more quickly.)
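The order in which values reach the stack in the attribute-replacement filter can be traced with a short Python sketch. It is an illustration only, under stated assumptions: the REPLACEMENTS table and the function replace_values are hypothetical stand-ins for the list entries above, and attribute values are assumed to be single unquoted words.

```python
import re

# Hypothetical stand-in for the list entries: (name, old value) -> new value.
REPLACEMENTS = {
    ("attr1", "foo"): "bar",
    ("attr2", "black"): "white",
    ("attr3", "one"): "zero",
}

# Assumption: values are single unquoted words.
PAIR = re.compile(r"\b(attr1|attr2|attr3)=(\w+)")

def replace_values(tag):
    stack = []   # plays the role of the \# stack
    pos = 0
    for m in PAIR.finditer(tag):
        name, value = m.groups()
        new_value = REPLACEMENTS.get((name, value))
        if new_value is None:
            continue                      # not a targeted value: a later \# keeps it as-is
        stack.append(tag[pos:m.start()])  # leading \# : text before the attribute
        stack.append(name + "=")          # (attrN=)\# pushes the attribute name
        stack.append(new_value)           # $SET(\#=...) pushes the new value; the old one is dropped
        pos = m.end()
    stack.append(tag[pos:])               # final \# : whatever is left over
    return "".join(stack)                 # \@ : dump the stack in order

print(replace_values("<Sometag attr1=foo attr2=blue attr3=one>"))
```

Here attr2=blue passes through untouched because "blue" is not one of the values being replaced, while attr1 and attr3 get their new values.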
