CleanUp, a subroutine to thwart hackers
A subroutine is defined via a 'sub' statement.
The arguments to the subroutine are provided in a list which is used when calling
the subroutine.
&CleanUp("this is a sentence");
is a call to CleanUp
with the argument "this is a sentence".
The first task of a subroutine is to obtain its arguments (here the single argument "this is a sentence")
and make them locally available inside the subroutine:
sub CleanUp
{
my($v,$count,$i);
$v = $_[0];
The statement $v = $_[0] recovers the first argument passed to the subroutine, and
assigns it to the Perl variable $v.
If there were two arguments (sentences, numbers, or whatever), as in &CleanUp('this','that'), then
we would retrieve
one of them as $v1 = $_[0]; and the other as $v2 = $_[1];
Then we could use $v1 and $v2 as we saw fit.
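As a minimal sketch of retrieving two arguments, the made-up subroutine Join2 below (it is not part of CleanUp) copies both into local variables in a single list assignment, which has the same effect as assigning $_[0] and $_[1] one at a time.

```perl
use strict;
use warnings;

# Join2 is a hypothetical name used only for this illustration.
sub Join2
{
    my ($v1, $v2) = @_;      # same effect as $v1 = $_[0]; $v2 = $_[1];
    return "$v1 and $v2";
}

my $joined = Join2('this', 'that');
print "$joined\n";           # prints "this and that"
```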
The statement above, my($v,$count,$i), declares these variables as local to this
subroutine, so that we cannot mix them up with identically named variables in
other subroutines or the main program.
This is because variables in Perl are global unless declared otherwise.
This means that a variable in the main routine can be referenced in the subroutine.
Since this may not be what you want, it is important to think about local versus
global variables while you are working.
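The difference can be seen in a small sketch. The names $total, Shadow and Clobber are invented for this illustration; 'use strict' is left out on purpose, because it would reject the undeclared global.

```perl
use warnings;

$total = 10;                 # global: visible everywhere in this file

sub Shadow
{
    my $total = 99;          # a separate variable, private to Shadow
    return $total;
}

sub Clobber
{
    $total = 0;              # no "my", so this changes the global
    return $total;
}

my $private       = Shadow();
my $after_shadow  = $total;  # the global is untouched by Shadow
Clobber();
my $after_clobber = $total;  # the global was changed by Clobber
print "$private $after_shadow $after_clobber\n";   # prints "99 10 0"
```

Declaring every variable with my, as CleanUp does, avoids the accident that Clobber illustrates.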
if ( $_[0] ne '')
Here we set up an if statement, so that the code that follows is skipped when
the argument $_[0] is an empty string.
We could just as well have written if($v ne '').
{
$_[0] =~ s/\\//gi;
This statement replaces every backslash with nothing.
This is hard stuff, so let's start with something a little simpler.
The statement
$v =~ s/this/that/
finds the first occurrence of the word 'this'
in $v, and
substitutes the word 'that' for it.
The =~ operator applies the substitution to the variable on its left ($v) and stores the result back in that variable.
If
$v = "This was a person; this person had this cat"
before execution of the substitution,
$v = "This was a person; that person had this cat"
would be the result.
Had we instead written
$v =~ s/this/that/i
(notice the 'i' at the end, which stands for ignore case), the result would
have been
$v = "that was a person; this person had this cat"
since ignoring case allows the substitution for "This", which is the first occurrence.
Without the ignore case, this "This" would have been skipped, since it is
not an exact match.
Had we instead written
$v =~ s/this/that/g
(notice the 'g' at the end, which stands for global), the result would have
been
$v = "This was a person; that person had that cat"
since the appended global modifier means find every occurrence, not just the first,
and make the substitution.
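The three variants can be tried side by side. The sketch below starts each one from a fresh copy of the same sentence, and adds a fourth case (both modifiers together) that is not discussed above; the variable names are chosen for this illustration only.

```perl
use strict;
use warnings;

my $original = "This was a person; this person had this cat";

my $first = $original;
$first =~ s/this/that/;          # first exact match only

my $nocase = $original;
$nocase =~ s/this/that/i;        # first match, ignoring case

my $all = $original;
$all =~ s/this/that/g;           # every exact match

my $all_nocase = $original;
$all_nocase =~ s/this/that/gi;   # every match, ignoring case

print "$first\n";        # This was a person; that person had this cat
print "$nocase\n";       # that was a person; this person had this cat
print "$all\n";          # This was a person; that person had that cat
print "$all_nocase\n";   # that was a person; that person had that cat
```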
Our purpose in using CleanUp
is to remove hacker characters, i.e., characters which a normal student
would not employ in her answer, but which a hacker might employ in an attempt to gain command
control of our computer.
The first suspect character is the backslash, since this is the character which escapes
the next character, i.e., makes it lose its function as a potential control character,
and instead be interpreted as itself.
What a headache.
$_[0] =~ s/\\//gi;
uses the escape character itself (the first \) to make the second backslash be interpreted
as a literal backslash rather than as an escape character.
So the first part of this statement says, find the backslash.
The second part (//) says replace it with nothing, i.e., remove it.
(/q/) would have changed the backslash to the letter 'q'.
Finally, the appended modifier 'g' says find every occurrence of a backslash
and replace it.
The ignore case modifier 'i' has no effect here, since a backslash has no upper or lower case, but it feels safer.
The next statement removes all dollar signs ($).
$_[0] =~ s/\$//gi;
The next one removes all pound signs (#).
$_[0] =~ s/\#//gi;
The next one removes all tildes (~).
$_[0] =~ s/\~//gi;
The final edit replaces every caret (^) with two asterisks (**).
$_[0] =~ s/\^/**/gi;
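To see the whole cleaning sequence at work, here is a sketch that assumes each pattern is meant to match one literal character. Scrub is a hypothetical stand-in for the cleaning part of CleanUp, and the sample answer is invented.

```perl
use strict;
use warnings;

# Scrub is a hypothetical helper written for this illustration.
sub Scrub
{
    my ($text) = @_;          # work on a copy, not on the caller's variable
    $text =~ s/\\//g;         # remove every backslash
    $text =~ s/\$//g;         # remove every dollar sign
    $text =~ s/#//g;          # remove every pound sign
    $text =~ s/~//g;          # remove every tilde
    $text =~ s/\^/**/g;       # turn every caret into **
    return $text;
}

my $answer  = '(x^2) + \$y#~z';   # single quotes keep the backslash as typed
my $cleaned = Scrub($answer);
print "$cleaned\n";               # prints "(x**2) + yz"
```

Unlike the statements above, which edit $_[0] and therefore change the caller's own variable, this sketch returns a cleaned copy and leaves the original alone. Which behaviour is wanted depends on how the caller uses the result.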
What now follows in this subroutine is a set of instructions to count brackets.
In algebra, if the student enters unpaired brackets, she should be warned, since
the linear notation is disconcerting to many students.
$count = 0;
for ($i=0;$i<length($v);$i++){
#print "<br>i = ",$i," and count = ",$count, "<br> substr = ",substr($v,$i,1);
if(substr($v,$i,1) eq '('){
$count++;
}
elsif(substr($v,$i,1) eq ')'){
$count--;
if($count < 0){print "<br><font color=red>Bracket error at position </font>",$i;}
}
#print "done with loop one time";
}
if ($count != 0){print "<br><font color=red>Bracket error, more of one kind than another</font>";}
}
}
return -1;
Here is the code annotated:
$count = 0;
$count is going to be our counter, i.e., we will increment it when we encounter a '(' in
the student's answer, and we will decrement it when we encounter a ')'.
for ($i=0;$i<length($v);$i++){
This statement sets up a loop, whose body starts at the '{' and ends at the matching '}' (below).
It defines a loop variable, $i, and sets it equal to zero.
It next defines the condition for continuing,
$i<length($v)
where length($v) is the length of the string $v, and we are going to proceed
so long as the index $i is less than this length.
Finally, each time through the loop, we will increment the loop variable, i.e.,
$i++.
This last statement is shorthand for $i = $i+1.
Take your pick.
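A stripped-down sketch of the same loop shows how it visits a string one character at a time. The sample string and the @seen array are invented for this illustration.

```perl
use strict;
use warnings;

my $v = "a(b)";
my @seen;                                   # characters in the order visited
for (my $i = 0; $i < length($v); $i++) {
    push @seen, substr($v, $i, 1);          # the single character at position $i
}
print join(' ', @seen), "\n";               # prints "a ( b )"
```

Positions are counted from zero, so the last character sits at position length($v) - 1, which is why the loop continues only while $i is less than the length.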
#print "<br>i = ",$i," and count = ",$count, "<br> substr = ",substr($v,$i,1);
This is a commented-out debugging statement. Remove the pound sign and you will
see what is going on while the loop executes.
if(substr($v,$i,1) eq '('){
Here we set up a condition, asking whether the one-character substring of $v at position $i is a left bracket
'('.
If it is, we execute the statements enclosed in the next pair of curly brackets, '{' to '}'.
$count++;
}
If we encountered a left bracket, we increment the counter.
elsif(substr($v,$i,1) eq ')'){
$count--;
if($count < 0){print "<br><font color=red>Bracket error at position </font>",$i;}
}
Otherwise, if we find a right bracket, we decrement the counter.
Further, if the counter has dropped below zero, a right bracket has appeared with no earlier left bracket to match it,
so we report a bracket error using the print statement (which prints on the user's screen).
#print "done with loop one time";
}
if ($count != 0){print "<br><font color=red>Bracket error, more of one kind than another</font>";}
}
First we tested that right brackets never preceded their left brackets, and now we test whether or not
the number of left brackets equals the number of right brackets.
If not, we print out a warning message.
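The bracket check can also be tried on its own. The sketch below uses the same counting logic but, as an assumption made for easier testing, collects the warnings in a list instead of printing HTML. BracketProblems is a hypothetical name and not part of the original code.

```perl
use strict;
use warnings;

# BracketProblems is a hypothetical helper written for this illustration.
# It returns a list of warnings; an empty list means the brackets pair up.
sub BracketProblems
{
    my ($v) = @_;
    my $count = 0;
    my @problems;
    for (my $i = 0; $i < length($v); $i++) {
        my $ch = substr($v, $i, 1);
        if ($ch eq '(') {
            $count++;
        }
        elsif ($ch eq ')') {
            $count--;
            push @problems, "Bracket error at position $i" if $count < 0;
        }
    }
    push @problems, "Bracket error, more of one kind than another" if $count != 0;
    return @problems;
}

for my $answer ('(a+b)*(c)', '(a+b', ')a(') {
    my @problems = BracketProblems($answer);
    print "$answer: ", (@problems ? join('; ', @problems) : 'ok'), "\n";
}
```

The third sample, ')a(', has equal numbers of each bracket, so only the position test catches it. That is why both tests are needed.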
}
return -1;
Finally, the file ends by returning a true value (here -1; any true value, conventionally 1, would do).
Perl insists on this for every file loaded with 'require'; it is not a requirement of subroutines themselves.
More on this later.
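In the meantime, here is a small sketch of 'require' at work. It writes a tiny library file into a temporary directory and then loads it. The file name cleanup_demo.pl and the subroutine Shout are invented for this illustration.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

my $dir  = tempdir(CLEANUP => 1);          # removed again when the program exits
my $file = "$dir/cleanup_demo.pl";         # hypothetical file name

open(my $fh, '>', $file) or die "cannot write $file: $!";
print {$fh} <<'END_LIB';
sub Shout { return uc $_[0]; }
1;   # the file's final value, which require expects to be true
END_LIB
close($fh) or die "cannot close $file: $!";

require $file;                             # loads the file and defines Shout
print Shout("loaded"), "\n";               # prints "LOADED"
```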
Here is where I've stopped editing. Sorry.