HTML FORMS — rewrite code samples using $clean conventions

NOTE: The "name" attribute is deprecated on most elements, but form controls still need it: browsers submit each value under its "name", and PHP's $_POST array is keyed by it. You may want to give each field both "name" and "id".

FORM

<form id="myform" name="myform">

[The field here is hidden.]

checkbox 1

checkbox 2

checkbox 3

<!-- NAME groups radio buttons. ID allows LABEL to distinguish them. -->

radio button 1

radio button 2

radio button 3

</form>

...

Forms submit their values via ACTION. When SUBMIT is clicked, the browser sends the form's values to the page named in ACTION.

action="apage.html"

How the values arrive is determined by METHOD.

method="post" or method="get"

In PHP, read the values from $_GET when the method is GET, from $_POST when it is POST, or from $_REQUEST for either.

$_GET['formfield']       <form id="myform" name="myform" action="apage.html" method="get">
$_POST['formfield']      <form id="myform" name="myform" action="apage.html" method="post">
$_REQUEST['formfield']   (REQUEST processes variables transmitted by GET, POST, or COOKIE)
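
To make the difference between the two methods concrete, here is a minimal JavaScript sketch of how field values are encoded before they are sent. The helper names encodeFields and buildGetUrl are made up for this illustration; URLSearchParams is the standard API available in browsers and in Node.

```javascript
// Hypothetical helper: encode a plain object of field names and values the way
// a browser encodes form fields (application/x-www-form-urlencoded).
function encodeFields(fields) {
  return new URLSearchParams(fields).toString();
}

// With method="get" the encoded string is appended to the ACTION URL.
// With method="post" the same string travels in the request body instead.
function buildGetUrl(action, fields) {
  return action + "?" + encodeFields(fields);
}

console.log(buildGetUrl("apage.html", { formfield: "some value" }));
// prints: apage.html?formfield=some+value
```

On the PHP side, that request would make "some value" available as $_GET['formfield'].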

See below for examples of processing forms.

LABELS

<label for="lname">Enter your last name:</label> <input id="lname" name="lname" type="text" size="30" />

The "FOR" attribute associates the label with the field. For radio buttons and checkboxes, clicking the label yields the same result as clicking the input.

You can align your form fields by styling LABEL with CSS.

/* CSS rule */
label { width: 10em; float: left; } 

IE6, the polyp of web browsers, will often screw up floats. You may need to declare the containing DIV as position "relative" if it is inside a complex structure.

The "ACCESSKEY" attribute is supposed to work in LABEL, too, but I'm not getting any joy from it.

TEXT

<input type="text" id="xxx" name="xxx" size="12"  />
<input type="text" id="xxx" name="xxx" value="some value" size="12" maxlength="15"  />

Can include javascript events such as "onchange", "onmouseover", "onmouseout", etc.

The scripted version below loads the field from a PHP variable and upper-cases the text after the user has altered it:

<script language="JavaScript" type="text/javascript">

	function ucase(myfield) {
	var d = document.getElementById(myfield);
		if (d.value.length > 0) {
			d.value = d.value.toUpperCase();
		}
	}
</script>

<input type="text" size="12" maxlength="15" id="xxx" name="xxx"
	value="<?php echo htmlentities($xxxval); ?>" onchange="ucase('xxx')" />

PASSWORD

<input type="password" id="xxx" name="xxx" size="12" maxlength="25" />

Like text, but the browser masks each typed character (with asterisks or dots) instead of displaying it.

HIDDEN

<input type="hidden" id="xxx" name="xxx" />

Hidden elements are not secure. Anyone can look at the page's source and see and alter values stored there.
They can be convenient holders for values used by scripts or other pages, just don't trust them more than other form elements.

CHECKBOX

<input id="chk_1" name="chk_1" type="checkbox" checked="checked" />

The script below creates checkboxes based upon the value of bits in an integer.

<?php
	function testbit($b, $f) {  // is bit ON? true/false
		return (($f >> $b) & 1) == 1;  // does (($flags shifted RIGHT by bit) AND 1) equal 1?
	}

	function setbit($b, $f) {  // turn bit ON
		return ((1 << $b) | $f);  // 1 shifted LEFT by bit locates the bit; set it with OR
	}

	function clearbit($b, $f) {  // turn bit OFF
		return ((0xFFFF - (1 << $b)) & $f);  // locate bit, subtract it from 0xFFFF, clear with AND
	}

	# code assumes $flags is an integer whose bits hold the value of each checkbox (16 boxes possible)
	# and the $flag_title array [0..n] is loaded with titles retrieved by mysql_query

	for ($i = 0; $i < $chk_max; $i++) {
		$checked = (testbit($i, $flags)) ? "checked='checked'" : '';
		$html['title'] = htmlentities($flag_title[$i]);  // escape the title before output
		echo "<p><input id='chk_$i' name='chk_$i' type='checkbox' $checked /> {$html['title']}</p>\n";
	}
?>
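
If the same flags need to be read or changed in the browser, the bit tests above translate directly to JavaScript. This is a sketch under stated assumptions rather than part of the original page: packFlags is a made-up helper, and clearBit uses a NOT mask instead of the 0xFFFF subtraction.

```javascript
// Bit helpers: b is the bit position, f is the integer holding the flags.
function testBit(b, f) {
  return ((f >> b) & 1) === 1; // shift the wanted bit to position 0, then mask it
}

function setBit(b, f) {
  return (1 << b) | f; // OR with a mask that has only bit b set
}

function clearBit(b, f) {
  return ~(1 << b) & f; // AND with a mask that has every bit set except bit b
}

// Hypothetical helper: pack an array of checkbox states into one integer.
function packFlags(states) {
  let packed = 0;
  states.forEach(function (on, i) {
    if (on) packed = setBit(i, packed);
  });
  return packed;
}

const demoFlags = packFlags([true, false, false, true]); // bits 0 and 3 are set
console.log(demoFlags, testBit(3, demoFlags), clearBit(0, demoFlags));
// prints: 9 true 8
```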

RADIO

<input type="radio" id="sex_male" name="sex" value="male" checked="checked" /> male <br />
<input type="radio" id="sex_female" name="sex" value="female" /> female <br />
<input type="radio" id="sex_other" name="sex" value="other" /> other <br />

A click on any button clears others of the same name.

BUTTON

<input type="button" value="button name" />

Buttons need to be associated with an "event" to be useful.

Here, ONCLICK passes a reference to the entire form, so doSomething() can access other form fields:

<input type="button" value="button name" onclick="doSomething(this.form)" />

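The doSomething() script could access other form elements like this:

function doSomething(myform) {
	// each property is the input element itself; read its text from .value
	var ln = myform.lastname.value;
	var fn = myform.firstname.value;
	var fullname = fn + ' ' + ln;
	return fullname;
}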

RESET

<input type="reset" />

This example renames the button and confirms the action:

<form onReset="return confirm('This will undo your changes.  Proceed?')">
	<input type="reset" value="Reset Form">
</form>

An ONCLICK event could analyze other fields or vars to decide whether to allow or confirm the reset. The handler must return false to cancel it:

<form . . . >
	<input type="reset" value="Reset Form" onclick="return resetform()">
</form>

In general, RESET may be more confusing than useful. Users might mistake it for SUBMIT and lose their work.

TEXTAREA

<textarea name="notes" id="notes" rows="7" cols="65"></textarea>
<textarea name="notes" id="notes" rows="7" cols="65"><?php echo htmlentities($mytext); ?></textarea>

SELECT OPTIONS: DROP-DOWN LISTS

<form id="myform" name="myform">
	<select id="mylist" name="mylist">
		<option value="" selected="selected"> </option>
		<option value='1' >option 1</option>
		<option value='2' >option 2</option>
		<option value='3' >option 3</option>
	</select>
</form>

To reference a selected option:

document.myform.mylist.options[document.myform.mylist.selectedIndex].value

To populate select options from query results:

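<select id="mylist" name="mylist">
	<?php
		// $oldselect stores a previous selection (if any)
		// $mygroup is assumed to have been filtered as an integer before it reaches the query
		$sql = "select id, title from t_lists where groupID = $mygroup order by grpsort";
		$r = mysql_query($sql);
		while ($listrow = mysql_fetch_assoc($r)) {
			$selected = ($listrow['id'] == $oldselect) ? "selected='selected'" : '';
			$html['id'] = htmlentities($listrow['id'], ENT_QUOTES);  // escape query results before output
			$html['title'] = htmlentities($listrow['title']);
			echo "<option value='{$html['id']}' $selected>{$html['title']}</option>\n";
		}
	?>
</select>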

To act on selection change — declare "onchange" event in SELECT itself:

<select id="mylist" name="mylist" onchange="window.location='mypage.php?var=' + 
	document.myform.mylist.options[document.myform.mylist.selectedIndex].value">

SELECT OPTIONS: STATIC LIST (with multiple selections)

<select id="mylist" name="mylist" multiple="multiple" size="3">
	<option value='1'> option 1 </option>
	<option value='2'> option 2 </option>
	<option value='3'> option 3 </option>
</select>

Multiple selects require use of CTRL or SHIFT keys.

Javascript to test for multiple selections in a list:

// call this function with "searchList('listname')"
function searchList(e) {
	var mylist = document.getElementById(e);
	for (var i = 0; i < mylist.options.length; i++) {
		if (mylist.options[i].selected) {
			// do something useful
		}
	}
}
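
As one possible "something useful", the loop can collect the values of every selected option. The sketch below is an illustration, not the page's own code: selectedValues is a made-up name, and mockList stands in for the element that document.getElementById() would return, so the code also runs outside a browser.

```javascript
// Return the values of all selected options. `list` is anything shaped like a
// <select> element: an object whose `options` hold {value, selected} entries.
function selectedValues(list) {
  const chosen = [];
  for (let i = 0; i < list.options.length; i++) {
    if (list.options[i].selected) {
      chosen.push(list.options[i].value);
    }
  }
  return chosen;
}

// Stand-in for a multiple SELECT with options 1 and 3 highlighted.
const mockList = {
  options: [
    { value: "1", selected: true },
    { value: "2", selected: false },
    { value: "3", selected: true },
  ],
};

console.log(selectedValues(mockList)); // prints: [ '1', '3' ]
```

In a page, the call would be selectedValues(document.getElementById('mylist')).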

SUBMIT

<input type="submit" id="OK" name="OK" value="OK" />

In this example, javascript checks for required fields prior to submitting form.

<script language="JavaScript" type="text/javascript">
	function chkfield(f) {
		// verify form field has some length
		var e = document.getElementById(f);
		if (e.value.length <= 0) {
			e.focus();
			return false;
		} else {
			return true;
		}
	}

	function checkFields() {
		var OK = 'Y';
		/* check in reverse order so focus winds up on first unfilled field */
		if (!chkfield("field2")) { OK = 'N'; }
		if (!chkfield("field1")) { OK = 'N'; }
		if (OK == 'N') {
			alert("Please provide information for each category.");
			return false;
		} else {
			return true;
		}
	}
</script>

<input type="submit" id="OK" name="OK" value="OK" onclick="return checkFields()" />
<input type="submit" id="cancel" name="cancel" value="cancel" onclick="window.close()" />
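
The required-field test can also be written as a function with no page dependencies, which makes it easier to try out. This is a sketch with assumed names: firstMissingField and the shape of the values object are invented for the example, and unlike chkfield() it treats whitespace-only input as empty.

```javascript
// Return the name of the first required field that is empty, or null if all are filled.
// `values` maps field names to their current text; `required` lists names in page order.
function firstMissingField(values, required) {
  for (const name of required) {
    const text = values[name];
    if (typeof text !== "string" || text.trim().length === 0) {
      return name;
    }
  }
  return null;
}

console.log(firstMissingField({ field1: "abc", field2: "" }, ["field1", "field2"]));
// prints: field2
```

Because the function reports the first unfilled field directly, there is no need to check in reverse order to decide where the focus should land.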

You can approximate the "submit" type with a standard A tag and javascript.
In the next example, HREF points to an empty fragment (#),
and the browser's normal attempt to follow it is cancelled by "return false", which is appended to whatever scripts are invoked.

<a href="#" onclick="someScript(); return false;"> OK </a>

Processing form values with PHP

Below is a FORM that POSTs its values to the same page — PHP decides whether to process or display.

<?php
	if (isset($_POST['xxx'])) {

		// . . . process variables
		// . . . redirect to page in parent folder

		// "check_input" is a function you would create to test the host name
		$clean['host'] = ( check_input( $_SERVER['HTTP_HOST'], 'host' ) ) ? $_SERVER['HTTP_HOST'] : false;
		if ($clean['host']) {
			$html['host'] = htmlentities($clean['host']);  // not sure about this
			header("Location: http://" . $html['host'] . dirname($_SERVER['PHP_SELF']) . "/../apage.html");
			exit;
		} else {
			die;
		}
	} else {
?>

regular HTML goes here, including . . .

<form id="myform" name="myform" action="<?php echo htmlentities($_SERVER['PHP_SELF']); ?>" method="post">
	<p>Required Field: <input type="text" id="xxx" name="xxx" size="12" /></p>
</form>
<?php } ?>

ACTION can add arguments, even if the method is POST.

(Process arguments with $_GET, form values with $_POST, or both with $_REQUEST.)

action="<?php echo $_SERVER['PHP_SELF'] . "?arg1=$arg1&arg2=$arg2"; ?>" method="post"

If you display error messages as you process form values, the page has already sent output and header() can no longer redirect (unless output is buffered), so redirect with javascript.

<script type="text/javascript">
	location.href = "../apage.html";   /* go to a page in parent folder */
</script>

[Graphic: clean input / escape output]

NECESSARY CLEAN-UP: CHECK YOUR ASSUMPTIONS

Although you might create a single file to display and process a web form, that doesn't mean incoming data were created by your form.

A wily user can analyze your form, and send your script any malicious code he chooses. The phone number you think you're getting might be a specially-crafted bit of script designed to deface your page, break your application, or steal information.

Assume all incoming data are tainted. As the graphic shows, this includes PHP arrays and the results from HTTP requests or MySQL queries. All could contain unexpected, corrupt, or malicious data.

The following is based on suggestions from Chris Shiflett's PHP Security Briefing (which, along with the rest of his brainbulb site, seems to have vanished).

A consistent naming convention helps show what you can trust, and what's ready for output. Shiflett suggests 3 arrays for keeping track of variables:

$clean = array();   // stores all filtered data (none of it escaped)
$mysql = array();   // stores escaped data ready for insertion into database
$html = array();    // stores escaped data for display or transmittal to another web resource

The idea is simple. No input to your application is trusted until it has been tested and placed inside $clean. $clean is the single source of data for further processing.

And $clean is the only source of data for $mysql and $html. (They never draw data from each other because their data are escaped for different contexts.) They become the sole sources of data for SQL queries or HTML output.

FILTER ALL INPUT

Use a "whitelist" approach. When possible, allow a set of known values, rather than try to anticipate all bad values.

// is_allowed() is an illustrative name; "return" needs an enclosing function
function is_allowed($val) {
	switch ($val) {
		case 'apples' :
		case 'oranges' :
		case 'bananas' :
			return true;
		default :
			return false;
	}
}
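
The same whitelist idea can be sketched in JavaScript, for instance as a quick check in the browser before the form is sent. The names ALLOWED_FRUIT, isAllowed, and cleanChoice are made up for this example, and a browser-side check never replaces the server-side filter, because an attacker can bypass the page entirely.

```javascript
// The complete set of acceptable values; anything else is rejected.
const ALLOWED_FRUIT = new Set(["apples", "oranges", "bananas"]);

function isAllowed(val) {
  return ALLOWED_FRUIT.has(val);
}

// Keep the value only if it is on the list; otherwise store false,
// following the same pattern as the $clean examples.
function cleanChoice(val) {
  return isAllowed(val) ? val : false;
}

console.log(cleanChoice("oranges"), cleanChoice("<script>"));
// prints: oranges false
```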

Test input length. If too big, generate an error message or lop it off (as shown here).

$clean['val'] = (strlen($_POST['val']) > $max) ? substr($_POST['val'], 0, $max) : $_POST['val'];

Test character type. (PHP provides several true/false character type functions.)

$clean['val'] = (ctype_alnum($_POST['val'])) ? $_POST['val'] : false;  // keep only if all characters are alphanumeric

Weed out potential trouble. PHP can strip tags and script functions.

$clean['val'] = strip_tags($_POST['val'],'<b><i><u>');  // remove all HTML tags, except b, i, & u

For simple string replacements, use STR_REPLACE, which is much faster than EREG_REPLACE or PREG_REPLACE. The first example illustrates how the contents of one array can replace the substrings listed in another array. (htmlentities() would be a better tool for this particular case.)

$example = "<p>This is a paragraph.</p>";
echo str_replace( array('<','>'), array('&lt;','&gt;'), $example);

You can replace several strings with a single string, using str_replace or regular expressions. Here are 2 examples of replacing common javascript function names with "forbidden." (The 2nd example comes from "guestbook." It has the advantage of ignoring case.)

$stripAttrib = array( "javascript:","onload","onclick","ondblclick","onmousedown","onmouseup","onmouseover",
		"onmousemove","onmouseout","onkeypress","onkeydown","onkeyup","style","class","id" );
$value = str_replace( $stripAttrib, 'forbidden', $value);

$stripAttrib = 'javascript:|onload|onclick|ondblclick|onmousedown|onmouseup|onmouseover|'
		. 'onmousemove|onmouseout|onkeypress|onkeydown|onkeyup|style|class|id';
$value = preg_replace("/$stripAttrib/i", 'forbidden', $value);
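
For comparison, here is a rough JavaScript equivalent of the case-insensitive replacement. It is a sketch only: the names STRIP_ATTRIB, STRIP_PATTERN, and forbid are invented, and the word list is copied from the examples above.

```javascript
const STRIP_ATTRIB = [
  "javascript:", "onload", "onclick", "ondblclick", "onmousedown", "onmouseup", "onmouseover",
  "onmousemove", "onmouseout", "onkeypress", "onkeydown", "onkeyup", "style", "class", "id",
];

// Join the words into one alternation; "g" replaces every match, "i" ignores case.
const STRIP_PATTERN = new RegExp(STRIP_ATTRIB.join("|"), "gi");

function forbid(value) {
  return value.replace(STRIP_PATTERN, "forbidden");
}

console.log(forbid('<img src="a.png" OnLoad="steal()">'));
// prints: <img src="a.png" forbidden="steal()">
```

A blacklist like this also matches inside ordinary words (forbid("video") returns "vforbiddeneo"), which is one more reason to prefer a whitelist where you can.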

Test input pattern. Regular expressions can check for complex expected patterns.

The next code allows 4-28 letters, numbers, or underscores (and ignores case).

$clean['val'] = (preg_match('/^[\w]{4,28}$/i', $_POST['val']) ) ? $_POST['val'] : false;  // set false if match fails

All patterns your form might generate could be tested in a single function.

The next code tests 'lastname' (sent via POST) against an expected "name" pattern. If the pattern matches, $clean['lastname'] is set to $_POST['lastname'], otherwise to FALSE.

$clean = array();
$clean['lastname']=(checkinput($_POST['lastname'],'name')) ? $_POST['lastname'] : false;  

function checkinput( $s, $type ) {
  switch ($type) {
    case "name" :  // allow any number of a-z, hyphen, space, period, apostrophe (ignore case)
      return preg_match( '/^[a-z-\'\. ]*$/i' , $s );
    case "int" :
      return preg_match('/^[0-9]{1,8}$/' , $s);  // 1-8 digits between 0 and 9
    case "email" :  // allow dots in account.name, only 2-4 characters in domain (ignore case)
      return preg_match('/^[a-z]{1}[\w]+([.][\w]+)*[@][\w]+([.][\w]+)*[.][a-z]{2,4}$/i', $s );
    case "USphone" :  // US-style phone format ("#" delimiters, because the pattern contains "/")
      return preg_match('#^(\()?\d{3}(?(1)\) ?|[-/ \.])\d{3}[- \.]\d{4}$#', $s);
    default :  // unknown type: fail safe
      return false;
  }  // end switch
}
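
A table of named patterns is another way to keep every test in one place. The JavaScript sketch below assumes only three types; the "username" entry, the PATTERNS map, and the function names are inventions for the example. The phone pattern is left out because JavaScript regular expressions do not support the conditional group it uses.

```javascript
// One entry per input type. A Map avoids accidental hits on inherited object properties.
const PATTERNS = new Map([
  ["name", /^[a-z'. -]*$/i],   // letters, apostrophe, period, space, hyphen
  ["int", /^[0-9]{1,8}$/],     // 1-8 digits
  ["username", /^\w{4,28}$/],  // 4-28 letters, numbers, or underscores
]);

// True only when the type is known and the whole string matches its pattern.
function checkInput(s, type) {
  const pattern = PATTERNS.get(type);
  return typeof s === "string" && pattern !== undefined && pattern.test(s);
}

// Keep the value if it passes; otherwise store false.
function cleanInput(s, type) {
  return checkInput(s, type) ? s : false;
}

console.log(cleanInput("O'Rourke", "name"), cleanInput("'; drop table users --", "name"));
// prints: O'Rourke false
```

An unknown type fails the check instead of passing it, so a typo in the type name cannot let unfiltered data through.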

ESCAPE ALL OUTPUT — MYSQL_REAL_ESCAPE_STRING() and HTMLENTITIES()

A consistent naming convention helps here. If the $clean array never holds escaped data, you know its data should be escaped before leaving your application.

$mysql['val'] = mysql_real_escape_string($clean['val']);
$result = mysql_query( " insert into users set colname = '{$mysql['val']}' " );
$html['val'] = htmlentities($clean['val']);
echo "<p>{$html['val']}</p>";

PHP's "foreach" construction provides a rapid way to escape all cleaned data:

foreach ($clean as $k => $v) {
	$mysql[$k] = mysql_real_escape_string($v);
}

foreach ($clean as $k => $v) {
	$html[$k] = htmlentities($v);
}

Both functions preserve characters that might have special meaning in another context. If PHP lacks a routine for your database server, fall back to addslashes().

mysql_real_escape_string() protects special characters and can guard against some forms of attack.

Accepting input at face value can lead to situations like this. Assume a user enters his lastname as " O'Rourke ", and your script processes it this way:

$lastname = $_GET['lastname'];
mysql_query("insert into users set lastname='$lastname'");
The resulting SQL statement becomes:   " insert into users set lastname='O'Rourke' ".
The apostrophe truncates lastname to 'O', and the trailing Rourke' generates a syntax error.

Mysql_real_escape_string() "escapes" potentially troublesome characters by adding backslashes to them.

$mysql['lastname'] = mysql_real_escape_string($clean['lastname']);
mysql_query("insert into users set lastname='{$mysql['lastname']}'");
The SQL now is:   " insert into users set lastname='O\'Rourke' ".
The backslash causes the database server to see the apostrophe as a literal character, so the query inserts "O'Rourke" into the users table.
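
To see the two SQL strings side by side, the following JavaScript sketch builds them from the same input. toyEscape is a deliberately tiny stand-in written for this illustration: it escapes only backslashes and single quotes, and real code should rely on the database library's own escaping.

```javascript
// Toy stand-in for mysql_real_escape_string(), for illustration only:
// put a backslash in front of every backslash and single quote.
function toyEscape(s) {
  return s.replace(/[\\']/g, "\\$&");
}

const lastname = "O'Rourke";

// Accepting input at face value: the apostrophe ends the SQL string early.
const naive = `insert into users set lastname='${lastname}'`;

// Escaped: the apostrophe is now read as a literal character.
const escaped = `insert into users set lastname='${toyEscape(lastname)}'`;

console.log(naive);   // prints: insert into users set lastname='O'Rourke'
console.log(escaped); // prints: insert into users set lastname='O\'Rourke'
```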

NOTE: The slashes added by mysql_real_escape_string() do not wind up in the table. They just preserve the punctuation your user entered.

More importantly, mysql_real_escape_string() helps ward off some types of SQL injection attacks.

Say your database host allows stacked queries. (MySQL doesn't, but MSSQL does.) You could lose your entire users table if a malicious user entered this as his last name: " '; drop table users -- ".
mssql_query("insert into users set lastname='$lastname'");
The SQL now resolves to:   " insert into users set lastname=' '; drop table users -- ' ".
The malicious user has injected a 2nd, destructive SQL statement.
The "--" instructs SQL to ignore the rest of the line, so the trailing ' doesn't create an error.

Mysql_real_escape_string() neutralizes this attack by escaping the initial single quote.

These lines should provide a dual line of defense:
$mysql['lastname'] = mysql_real_escape_string($clean['lastname']);
mssql_query("insert into users set lastname='{$mysql['lastname']}'");
Your filtering shouldn't allow semi-colons in a lastname, so $clean['lastname'] should be null or false.
function is_name($v) {
	// does $v contain only letters, apostrophes, hyphens, or white space?
	return preg_match( "/^[a-zA-Z'\s-]+$/" , $v );  // returns 1 (match) or 0 (no match)
}

$clean['lastname'] = (is_name($_POST['lastname'])) ? $_POST['lastname'] : false;
If your filtering fails, mysql_real_escape_string would cause your SQL to become:
" insert into users set lastname=' \'; drop table users -- ' ".
Lastname would receive a nonsense value, but the users table would be unscathed.

But mysql_real_escape_string() can't guard against all types of injection.

Say your form feeds your script $_POST['selectedname'], when a user chooses a name in a <SELECT> field.
You limited <SELECT> to a subset of names, so you trust $_POST['selectedname'] to set a WHERE condition.
$result = mysql_query( "select * from users where lastname='{$_POST['selectedname']}' ");
Unfortunately, you can't depend on the <SELECT> field to limit your users' choices.
An attacker could construct a doctored form so that $_POST['selectedname'] contains " ' or ' '= ' ".

The SQL becomes:   "select * from users where lastname =' ' or ' ' = ' ' ";

Because " or ' ' = ' ' " will always be true, this will return the entire table to your attacker, from which he might gain hints about how to proceed.

This attack is thwarted if you filter $_POST['selectedname'], along these lines:
$clean['lastname'] = ( is_name($_POST['selectedname']) ) ? $_POST['selectedname'] : false;

if ($clean['lastname']) {
	$mysql['lastname'] = mysql_real_escape_string($clean['lastname']);
	$result = mysql_query(" select * from users where lastname= '{$mysql['lastname']}' ");
} else {
	// handle error here, perhaps:
	$html['error'] = htmlentities("Sorry.  I don't recognize your selection.");
	echo "<p>{$html['error']}</p>";
}

htmlentities() protects text destined for HTML output (including URLs placed in markup), and is a handy way to display source code.

$clean['msg'] = "<p>Here's the markup.</p>";
$html['msg'] = htmlentities($clean['msg']);
echo "<p>{$html['msg']}</p>";
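
When output is assembled in JavaScript instead of PHP, the same escaping step applies. The sketch below is an assumption-laden stand-in, not a port of htmlentities(): escapeHtml and HTML_ESCAPES are made-up names, and only the five characters with special meaning in HTML are converted.

```javascript
// The characters that have special meaning in HTML, and their entity forms.
const HTML_ESCAPES = { "&": "&amp;", "<": "&lt;", ">": "&gt;", '"': "&quot;", "'": "&#39;" };

// Replace each special character so the browser displays it instead of interpreting it.
function escapeHtml(text) {
  return String(text).replace(/[&<>"']/g, function (ch) {
    return HTML_ESCAPES[ch];
  });
}

const cleanMsg = "<p>Here's the markup.</p>";
console.log("<p>" + escapeHtml(cleanMsg) + "</p>");
// prints: <p>&lt;p&gt;Here&#39;s the markup.&lt;/p&gt;</p>
```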

If you want to display data about to be committed to the database, or provide a summary after it has been, your code should look something like this:

$mysql['val'] = mysql_real_escape_string($clean['val']);
$html['val'] = htmlentities($clean['val']);
echo "<p>You are about to insert {$html['val']} into the database.</p>";