Simple, right? I needed to extract numbers from strings such as “12 L per 100 k.m.”, “on-sale price: USD$299.99”, “2000 sq. ft.”, etc. PHP’s built-in FILTER_SANITIZE_NUMBER_FLOAT, however, was completely useless, since it just blindly filters out any character that isn’t a digit, decimal point, or sign:
- “12 L per 100 k.m.” returns “12100..”
- “on-sale price: USD$299.99” returns “-299.99”
- “2000 sq. ft.” returns “2000..”
Useless.
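For anyone who wants to reproduce those results, here is a minimal sketch. It assumes the filter was called through filter_var() with FILTER_FLAG_ALLOW_FRACTION, since the outputs above keep their periods and the filter strips them without that flag; the exact call isn't shown in this post, so treat the flag as an assumption.

```php
<?php
// Sample strings taken from the examples above.
$samples = array(
    '12 L per 100 k.m.',
    'on-sale price: USD$299.99',
    '2000 sq. ft.',
);

// Assumption: FILTER_FLAG_ALLOW_FRACTION is set, so "." survives
// alongside digits, "+" and "-". Everything else is stripped.
$results = array();
foreach ($samples as $sample) {
    $results[$sample] = filter_var(
        $sample,
        FILTER_SANITIZE_NUMBER_FLOAT,
        FILTER_FLAG_ALLOW_FRACTION
    );
    echo $sample, ' => ', $results[$sample], "\n";
}
// 12 L per 100 k.m. => 12100..
// on-sale price: USD$299.99 => -299.99
// 2000 sq. ft. => 2000..
```

The filter has no notion of where one number ends and the next begins; it only deletes characters, which is why the hyphen from "on-sale" ends up glued to the price.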
The PHP function below extracts numbers from strings far more reliably, with a few caveats:
- Only extracts the first number found in a string.
- Doesn’t support scientific notation (numbers with "e" or "E" in them).
- Doesn’t support French number formatting (thousands/millions/etc. separated by spaces, decimal point represented by a comma).
If anyone feels like remedying these shortcomings (or has any comments whatsoever about this function), please leave a comment.
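As one possible starting point for the scientific-notation caveat, here is a rough regex-based sketch. The function name extractNumberRegex and the pattern are hypothetical and are not part of the function in this post; the sketch still returns only the first number and still ignores French formatting.

```php
<?php
// Hypothetical alternative, not the function from this post.
// Returns the first number in $string as a numeric string, or NULL.
function extractNumberRegex($string)
{
    // First alternative: optional sign, then either comma-grouped digits
    // (1,234,567) or plain digits, an optional fraction, and an optional
    // exponent. Second alternative: a bare fraction such as ".5".
    $pattern = '/[-+]?(?:\d{1,3}(?:,\d{3})+(?!\d)|\d+)(?:\.\d+)?(?:[eE][-+]?\d+)?'
             . '|[-+]?\.\d+(?:[eE][-+]?\d+)?/';

    if (!preg_match($pattern, (string) $string, $matches)) {
        return NULL;
    }

    // Drop the thousands separators from the matched text.
    return str_replace(',', '', $matches[0]);
}

echo extractNumberRegex('6.02E+23 atoms'), "\n";            // 6.02E+23
echo extractNumberRegex('on-sale price: USD$299.99'), "\n"; // 299.99
echo extractNumberRegex('1,234.56 total'), "\n";            // 1234.56
```

One trade-off to keep in mind: with exponent support switched on, a product code such as "3e5" would be read as scientific notation, which may or may not be what you want.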
P.S. Since diving back into coding about 6 weeks ago, I’ve been astonished at how quickly I’ve managed to get up to speed. The incredible smoothness of my learning curve is due ENTIRELY to hundreds of generous developers who’ve posted videos, tutorials and their own code on the net, solely so that others can learn from and use it. As a small contribution back to the community that has helped me so much, I humbly offer the PHP function below, for any and all to use.
<?php
// This function extracts the first number from a string.
// It takes a single string as a parameter and returns the number as a numeric string, including its sign (+/-) if one is present in the input, or NULL if no number is found.
//
// Known limitations:
// * Does not support French number formatting (spaces instead of commas, decimal point represented by a comma).
// * Only extracts the first number found in a string.
// * Does not support scientific notation (numbers with "e" or "E" in them).
function extractNumber($string)
{
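    // Return NULL if the input is NULL or an empty string.
    // (A plain !$string test would also reject the valid input "0".)
    if ($string === NULL || $string === '') {
        return NULL;
    }
    // Break input string into an array of single characters:
    $chars = str_split($string);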
    // Set up some arrays for later use:
    $just_digits = array("0", "1", "2", "3", "4", "5", "6", "7", "8", "9");
    $digits_and_decimal = array_merge(array("."), $just_digits);
    $all_num_chars = array_merge(array("-", "+"), $digits_and_decimal);
    // Start with an empty result and no decimal point seen yet.
    // Comparisons against '' are used throughout because "0" is falsy in PHP.
    $number = '';
    $decimal_found = false;
    foreach ($chars as $key => $char) {
        // Look ahead safely; positions past the end of the string count as empty.
        $next = isset($chars[$key + 1]) ? $chars[$key + 1] : '';
        $after_next = isset($chars[$key + 2]) ? $chars[$key + 2] : '';
        if ($char == ",") {
            // If a comma is found before a number has been encountered in the input string, skip it and iterate to next char.
            if ($number === '') {
                continue;
            }
            // A "legit comma" is preceded by a digit, followed by exactly 3 digits (the 4th char after it is not a digit), and not found after a decimal:
            if (in_array($chars[$key - 1], $just_digits)
                && !$decimal_found
                && preg_match('/^[0-9]{3}(?![0-9])/', substr($string, $key + 1))) {
                continue; // Skip the comma and iterate to the next char.
            }
            break; // $char is a "rogue comma": the number ends here.
        }
        if (!in_array($char, $all_num_chars)) {
            if ($number !== '') { // A number has been started and $char is a non-numerical char: the number ends here.
                break;
            }
            continue; // No number has been started yet: iterate to next char.
        }
        if (in_array($char, $just_digits)) { // $char is a digit, and should be appended to $number.
            $number .= $char;
            continue;
        }
        if ($char == ".") {
            if ($decimal_found) { // $char is a "rogue decimal": the number ends here.
                break;
            }
            if (!in_array($next, $just_digits)) { // The char following the decimal is not a digit.
                if ($number !== '') { // Trailing period after a number: the number ends here.
                    break;
                }
                continue; // Stray period before any number (e.g. "approx. 5"): ignore it.
            }
            // $char is a "legit decimal" and should be appended to $number.
            $number .= $char;
            $decimal_found = true;
            continue;
        }
        // $char is a sign (+ or -):
        if ($number !== '') { // Sign occurs in the middle of a number: the number ends here.
            break;
        }
        // Keep the sign only if a number really starts right after it: a digit, or a decimal followed by a digit.
        if (in_array($next, $digits_and_decimal)
            && ($next != "." || in_array($after_next, $just_digits))) {
            $number .= $char;
        }
        // Otherwise the sign occurs before the beginning of a number and is ignored.
    }
    // Return NULL when no number was found:
    return $number === '' ? NULL : $number;
}
?>