Regular Expression General Input Validator
Regular expressions can do a lot to make user input from forms more secure. With JavaScript, it's not hard to write decent routines that filter input. What's far harder is forcing users to keep JavaScript turned on so these routines actually run and the data gets filtered. Good luck making users do anything! About 5% of them keep JavaScript disabled to avoid the malware routines that spring upon them like ravening beasts when they hit the landing page of some creep's predatory website. Happily, PHP runs on the host's server and cannot be so lightly dismissed by users. Regular expression input validation can be forced upon user data at that point, for everyone's benefit except predatory hackers. Good PHP input filtering brings would-be hackers to tears. Kind of touching, isn't it?
Anyway, we created a special page where you can learn more about security levels from JavaScript and PHP input filtering, and why it IS still a good idea to use JavaScript input filtering with validators.
In the code below, you'll see how one can deal with validation in several ways. If you'd like to try out our regular-expression-general-input-validator, use the link below:
regular-expression-general-input-validator.htm
The first filtering method is the trusty JavaScript alert, in which you inform the user that you're serious about him or her typing only acceptable characters, and you remind him or her which characters are okay. The function check() makes sure the input has 1 to 50 characters; that's the {1,50} part of the regular expression. We allow hyphens, since there are tons of legitimate uses for them, but 2 in a row is dangerous (for instance, a double hyphen starts a comment in SQL). So we run p=p.replace(/--/g," -") twice, because a single pass can still leave 2 in a row: three hyphens become a space followed by two hyphens. In the regular expression character class [A-Za-z0-9! @\,\.\?_-], we allow uppercase and lowercase letters, numbers, exclamation marks, spaces, @ signs, commas, periods, question marks, underscores, and hyphens. The test if (document.form.generalinput.value.search(ck_general)==-1) looks at the contents of the filled-out form input field. If they do not match the regular expression variable ck_general, search() returns -1 (meaning no match), and the chastising alert springs up and shames the user without mercy. (Okay, not so much.)
By the way, for testing here we used {alert("Your general input validated OK."); d.generalinput.value='';d.generalinput.focus();return false}} for good input, but for real use, change all that to {return true}}. There's no point in an OK message. The return true will cause the form to submit, so make sure the action=" " in the form tag gets a better action script than " ".
The second filtering method is the trusty fix-it-by-editing-it method, and it uses function check_edit(), in which you delete all unacceptable characters by replacing them with an empty string "". Note that in p=p.replace(/[^A-Za-z0-9! @\,\.\?_-]/g,""), the regular expression character class has a ^ at the beginning. This means negation, so all characters that are NOT in this character class get replaced by the "". Also note that we edit all instances of 2 hyphens in a row until there's only one, for safety. Finally, check out what happens if the user just hits submit with no content. The length will be 0, so we change the content to " " instead: a space. However invisible, it's at least legitimate. Perhaps the user wishes to fill in better data at a later time. Of course, if you want to give an alert or type in N.A. or change the content to something else, that's fine. We let the user see the content after editing, by use of a confirm box, and he or she gets to veto it or give a hearty thumbs up. Notice that {alert("Accepted"); d.generalinput.value='';d.generalinput.focus();} should be replaced with {return true} when you actually use the code somewhere.
The third filtering method is the trusty escape-all-special-characters method, and it uses function check_escape(), in which you use JavaScript's old reliable escape() and unescape() functions to encode a string. Encoding makes a string portable, so it can be transmitted across any network to any computer that supports ASCII characters. escape() turns every character except letters, digits, and * @ - _ + . / into a hexadecimal token such as %3F, and unescape() reverts those tokens back to regular characters. Since escape() leaves / @ + - alone, we escape them separately. The code also has replacements for ? = &, but escape() has already converted those, so these three are harmless no-ops. If the user happens to paste in characters above ASCII 127, these are escaped as well (as %XX up to code 255 and as %uXXXX beyond that), though they're not on keyboards, which explains why we said "paste in," not type in. Keep in mind that escape() and unescape() are legacy functions that current documentation marks as deprecated; encodeURIComponent() and decodeURIComponent() are the usual replacements. By the way, change the return false;} at the end of the function to return true;} for real use.
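To see which characters escape() handles on its own and which ones need the extra replacements, here is a minimal sketch that runs on a plain string. The name escapeGeneralInput is made up for this sketch; escape() and unescape() are the built-in legacy functions the article uses.

```javascript
// Hypothetical helper name; mirrors the escaping steps on a plain string.
function escapeGeneralInput(text) {
  return escape(text)          // leaves letters, digits, and @ * _ + - . / alone
    .replace(/\//g, "%2F")     // so / @ + - are converted by hand
    .replace(/@/g, "%40")
    .replace(/\+/g, "%2B")
    .replace(/-/g, "%2D");
}

var raw = "a/b?c=d&e@f+g-h";
var escaped = escapeGeneralInput(raw);

console.log(escaped);                   // a%2Fb%3Fc%3Dd%26e%40f%2Bg%2Dh
console.log(unescape(escaped) === raw); // true
console.log(escape("?=&"));             // %3F%3D%26 (escape() already handles these)
```

The round trip through unescape() gives back the original string, which is what the two read-only boxes on the test page are there to show.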
No JavaScript filtering method can be counted on to get the job done by itself: about 5% of users have JavaScript disabled for safety and a few others disable it for hacking, and our validation routines cannot run if JavaScript is not enabled. Don't forget to check out security levels from JavaScript and PHP input filtering for ideas and info about all this stuff.
You might wish to scope out our final form. There are 2 input boxes that get filled by the function check_escape(), not by the user. So, d.generalinputescaped.value=p; and p=unescape(p); and d.generalinputunescaped.value=p; are meant to show the before and after. The first is the escaped string and the second is after it gets turned from ASCII tokens to regular characters. The actual input box names are, respectively, generalinputescaped and generalinputunescaped.
<HTML>
<HEAD>
<META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=windows-1252">
<TITLE>Regular Expression General Input Validator</TITLE>
<meta name="description" content="Good, Tested, Regular Expression General Input Validators">
<meta name="keywords" content="Regular Expression General Input Validator,javascript Regular Expression General Input Validator,javascript, dhtml, DHTML">
<script type="text/javascript">
// Method 1: alert the user when the input contains unacceptable characters.
function check(){
var ck_general = /^[A-Za-z0-9! @\,\.\?_-]{1,50}$/;
var d=document.form;
var p=d.generalinput.value;
// Run twice: a single pass can still leave 2 hyphens in a row.
p=p.replace(/--/g," -");
p=p.replace(/--/g," -");
d.generalinput.value=p;
if (document.form.generalinput.value.search(ck_general)==-1)
{alert("Please only type letters, numbers, and ! @ , - . ? _ or space in your general input.");
d.generalinput.value='';d.generalinput.focus();return false}else
// Test-only branch: for real use, change it to {return true}}.
{alert("Your general input validated OK.");
d.generalinput.value='';d.generalinput.focus();return false}}
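// Method 2: edit the input by deleting unacceptable characters.
function check_edit(){
var d=document.form2;
var p=d.generalinput.value;
// Characters such as % + & ; ` ' \ " | * ~ < > ^ ( ) [ ] { } $ are insecure,
// so keep only the allowed set.
p=p.replace(/[^A-Za-z0-9! @\,\.\?_-]/g,"");
// Break up hyphen pairs after stripping, since deleting a character can create a new pair.
p=p.replace(/--/g," -");
p=p.replace(/--/g," -");
// Empty input (or input with nothing acceptable left) becomes a single space.
if (p.length<1) {p=" ";}
d.generalinput.value=p;
var r=confirm("Press OK to accept "+p+" as general input or Cancel to reject it.");
if (r==true)
// Test-only branch: for real use, change it to {return true}.
{alert("Accepted"); d.generalinput.value='';d.generalinput.focus();}else
{alert("It's rejected. Try again.");
d.generalinput.value='';d.generalinput.focus();}
return false;}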
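// Method 3: escape all special characters.
function check_escape(){
var d=document.form3;
var p=d.generalinput.value;
if (p.length<1) {p=" ";}
// escape() encodes everything except letters, digits, and @ * _ + - . /
// Characters above ASCII 127 are escaped as well, though they're not on keyboards.
p=escape(p);
// escape() leaves / @ + - alone, so we escape them separately.
p = p.replace(/\//g,"%2F");
p = p.replace(/@/g,"%40");
p = p.replace(/\+/g,"%2B");
p = p.replace(/-/g,"%2D");
// escape() has already converted ? = &, so these three are harmless no-ops.
p = p.replace(/\?/g,"%3F");
p = p.replace(/=/g,"%3D");
p = p.replace(/&/g,"%26");
// Show the escaped string, then the result of turning it back.
d.generalinputescaped.value=p;
p=unescape(p);
d.generalinputunescaped.value=p;
// For real use, change this to return true.
return false;}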
</script>
</HEAD>
<body>
<BR><BR><BR><BR>
<form style='margin-left:240px' name='form' action=" " method="POST" onsubmit="return check()">
Use letters, numbers, and <b>! @ , - . ? _ or space</b> in your general input.<br>
<INPUT maxLength="50" type="text" name="generalinput" size="50">
<INPUT TYPE="SUBMIT" value="Submit General Input">
<INPUT TYPE="RESET" value="reset">
</form>
<BR><BR><BR><BR>
<form style='margin-left:240px' name='form2' action=" " method="POST" onsubmit="return check_edit()">
Use letters, numbers, and <b>! @ , - . ? _ or space</b> in your general input.<br>Your general input will be edited if you goof.<br>
<INPUT maxLength="50" type="text" name="generalinput" size="50">
<INPUT TYPE="SUBMIT" value="Submit General Input">
<INPUT TYPE="RESET" value="reset">
</form>
<BR><BR><BR><BR>
<form style='margin-left:240px' name='form3' action=" " method="POST" onsubmit="return check_escape()">
Type whatever. Your general input will be escaped for safety.<br>
<INPUT maxLength="50" type="text" name="generalinput" size="50"> Type up to 50 characters<br>
<INPUT TYPE="SUBMIT" value="Submit Input to be Escaped"> <INPUT TYPE="RESET" value="reset"><br>
Escaped Escaped Escaped Escaped Escaped Escaped Escaped Escaped Escaped Escaped Escaped Escaped Escaped <br>
<INPUT style='margin-left:-239px' type="text" name="generalinputescaped" size="158">
<INPUT type="text" name="generalinputunescaped" size="50"> Unescaped<br>
</form>
</BODY>
</HTML>