Converting Strings to Numbers in Python
When working with data in Python, it's often necessary to convert strings that represent numbers into actual numerical values for manipulation and analysis. In this article, we explore methods for converting strings to integers and floats, as well as techniques to validate if a string is numeric, ensuring accurate conversions.
Using Built-in Methods: isdecimal() and isnumeric()
Python's string methods isdecimal() and isnumeric() check whether every character in a string is numeric, which makes them a useful first test before attempting a conversion.
isdecimal() accepts only decimal digits, while isnumeric() accepts any character that Unicode classifies as numeric, including special characters such as superscripts and fractions.
Example:
val = input("Your input: ")
print(val)
print(val.isdecimal())
print(val.isnumeric())
If we run this example with "123" as input, the string contains only decimal digits, so both isdecimal() and isnumeric() return True.
For "²" (superscript two), isdecimal() returns False, but isnumeric() still recognizes "²" as a numeric character and returns True.
- isdecimal(): Recognizes only characters in the Unicode "Nd" (Number, Decimal Digit) category. This covers the ASCII digits 0-9 as well as decimal digits from other scripts, such as Arabic-Indic digits (٣) and full-width digits (３).
- isnumeric(): Includes all characters recognized by isdecimal(), as well as other numeric characters such as:
  - Superscripts or subscripts (e.g., ², ₃).
  - Fraction symbols (e.g., ¼, ⅓).
  - Roman numerals (e.g., Ⅷ).
  - CJK numerals used in East Asian languages (e.g., 三).
General Rule:
If a character represents a numeric value in Unicode but is not a decimal digit (category "Nd"), isnumeric() returns True for it and isdecimal() returns False.
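The rule above can be checked directly against the Unicode database. The following is a minimal sketch; the helper name classify and the sample characters are chosen here for illustration. The sample includes an Arabic-Indic digit, which falls in category "Nd" and therefore passes both checks.

```python
import unicodedata

# Sample characters: ASCII digit, Arabic-Indic digit three, superscript two,
# one-quarter fraction, Roman numeral eight, CJK numeral three.
samples = ["5", "\u0663", "²", "¼", "Ⅷ", "三"]

def classify(ch):
    """Return (Unicode category, isdecimal, isnumeric) for one character."""
    return unicodedata.category(ch), ch.isdecimal(), ch.isnumeric()

for ch in samples:
    category, is_dec, is_num = classify(ch)
    print(f"{ch!r}: category={category}, isdecimal={is_dec}, isnumeric={is_num}")
```

Only the characters in category "Nd" pass isdecimal(); the rest pass isnumeric() alone.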
Neither isdecimal() nor isnumeric() handles decimal separators (such as . versus ,), because both methods only check whether each character is itself numeric.
They do not interpret the string as a number or account for localization, so strings like "3.14" and "3,14" fail both checks.
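Because these methods look at characters rather than at the value of the whole string, their result does not always agree with what int() accepts. The sketch below compares the two; the helper names passes_character_checks and converts_with_int are hypothetical and used only for this illustration.

```python
def passes_character_checks(text):
    """Return (isdecimal, isnumeric) for the whole string."""
    return text.isdecimal(), text.isnumeric()

def converts_with_int(text):
    """Return True if int() can parse the string."""
    try:
        int(text)
        return True
    except ValueError:
        return False

for text in ["123", "3.14", "3,14", "-5", "²"]:
    print(repr(text), passes_character_checks(text), converts_with_int(text))
```

Note the two mismatches: "-5" fails both character checks but converts fine, while "²" passes isnumeric() but cannot be converted by int().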
Using Regular Expressions
In addition to Python's built-in methods for checking numeric strings, a simple regular expression like r"^\d+$" can be used to achieve a similar result.
Example:
import re
val = "123"
if re.match(r"^\d+$", val):
    print("Valid integer")
else:
    print("Nope - not a number")
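How It Works:
- ^: Anchors the match to the start of the string.
- \d: Matches any decimal digit. For str patterns in Python 3 this includes Unicode decimal digits from other scripts; pass the re.ASCII flag to restrict it to [0-9].
- +: Ensures one or more digits are present.
- $: Anchors the match to the end of the string (or just before a trailing newline).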
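Two details of this pattern are easy to miss: what \d matches for str patterns, and how $ treats a trailing newline. The following sketch illustrates both; the names UNICODE_DIGITS and ASCII_DIGITS are introduced here only for the example.

```python
import re

# Same pattern, compiled with and without the re.ASCII flag.
UNICODE_DIGITS = re.compile(r"^\d+$")
ASCII_DIGITS = re.compile(r"^\d+$", re.ASCII)

arabic_indic = "\u0661\u0662\u0663"  # Arabic-Indic digits for 123

print(bool(UNICODE_DIGITS.match(arabic_indic)))  # True: \d matches Unicode decimal digits
print(bool(ASCII_DIGITS.match(arabic_indic)))    # False: re.ASCII limits \d to [0-9]

# "$" also matches just before a trailing newline; fullmatch() does not.
print(bool(UNICODE_DIGITS.match("123\n")))       # True
print(bool(re.fullmatch(r"\d+", "123\n")))       # False
```

If only ASCII digits and no trailing newline should be accepted, re.fullmatch() combined with re.ASCII is one option.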
Limitations
- No Handling of Floating-Point Numbers: The regex does not account for decimal separators (. or ,) and will reject valid floating-point numbers like "3.14" or "2,718".
- No Support for Signs (+/-): It does not allow a leading + or -. For example, "-123" and "+456" will be rejected.
- Locale Issues: Similar to isdecimal(), this regex does not handle locale-specific formats, such as numbers with a comma as the decimal separator ("3,14").
- Excludes Non-Decimal Numeric Characters: \d matches only decimal digits. Numeric characters like superscripts (²) or fractions (⅓) are excluded.
Improving the Regex
To make the regex more flexible, we can extend it to accept an optional +/- sign, an optional decimal part, and locale-specific decimal separators.
Optional +/- Sign:
r"^-?\d+$" # Allows optional leading "-" for negative numbers
r"^[+-]?\d+$" # Allows both "+" and "-" for signed numbers
To Validate Floating-Point Numbers:
r"^-?\d+(\.\d+)?$" # Allows integers and floats with an optional decimal point
r"^[+-]?\d+(\.\d+)?$" # Allows signed integers and floats
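To see what the signed floating-point pattern accepts and rejects, it can be run against a few sample strings. This is a small sketch; SIGNED_NUMBER and looks_like_number are hypothetical names chosen for the example.

```python
import re

# Signed integers and floats, using the pattern shown above.
SIGNED_NUMBER = re.compile(r"^[+-]?\d+(\.\d+)?$")

def looks_like_number(text):
    """Return True if text matches the signed integer/float pattern."""
    return SIGNED_NUMBER.match(text) is not None

for text in ["42", "-42", "+3.14", "3.", ".5", "1e3", "3,14"]:
    print(f"{text!r}: {looks_like_number(text)}")
```

The pattern is stricter than float(): strings such as "3.", ".5", and "1e3" are rejected here even though float() would parse them.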
To Handle Locale-Specific Decimal Separators:
You can replace the decimal separator dynamically based on the locale:
import locale
import re
# Example for a European locale (the locale must be installed on the system)
locale.setlocale(locale.LC_NUMERIC, 'de_DE.UTF-8')
decimal_separator = locale.localeconv()['decimal_point'] # Get the locale's decimal separator
# Build the regex dynamically
regex = rf"^[+-]?\d+({re.escape(decimal_separator)}\d+)?$"
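The same idea can be wrapped in a function that takes the separator as a parameter, which makes it usable even when the target locale is not installed. The following is a sketch under that assumption; build_number_pattern is a hypothetical helper, not part of the locale or re modules.

```python
import re

def build_number_pattern(decimal_separator="."):
    """Compile a signed integer/float pattern for the given decimal separator."""
    sep = re.escape(decimal_separator)  # escape "." so it is matched literally
    return re.compile(rf"^[+-]?\d+({sep}\d+)?$")

german_style = build_number_pattern(",")
print(bool(german_style.match("3,14")))   # True
print(bool(german_style.match("3.14")))   # False
print(bool(german_style.match("-7")))     # True
```

In a real application the separator could come from locale.localeconv()['decimal_point'], as in the snippet above.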
Converting Strings to Integers or Floats
Python’s int() and float() functions handle the actual conversion of strings to numeric types.
If the input string cannot be parsed as a number, a ValueError is raised:
Integer Conversion
a = "123"
b = int(a)
print(b) # Output: 123
Float Conversion
a = "12.3"
b = float(a)
print(b) # Output: 12.3
Exception Handling
a = "JBerries"
try:
    b = int(a)
except ValueError:
    print("Invalid input!") # Output: Invalid input!
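The three snippets above can be combined into one helper that tries int() first and falls back to float(). This is a minimal sketch; to_number is a hypothetical name, and returning None for invalid input is a design choice made for the example.

```python
def to_number(text):
    """Return an int or float parsed from text, or None if neither conversion works."""
    for convert in (int, float):
        try:
            return convert(text)
        except ValueError:
            continue
    return None

print(to_number("123"))       # 123
print(to_number("12.3"))      # 12.3
print(to_number("JBerries"))  # None
```

Keep in mind that float() also accepts strings such as "inf" and "nan", so callers that need finite values may want an additional check.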
Converting Locale-Specific Numbers
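int() and float() expect a period as the decimal separator, so a string like "3,14" raises a ValueError. To handle such cases, you can:
- Replace the decimal separator according to the locale before converting.
- Use the locale module in Python for proper numeric parsing.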
Example: Replacing Locale-Specific Separators
val = "3,14" # Example using comma as decimal separator
normalized_val = val.replace(",", ".")
try:
    number = float(normalized_val)
    print(f"Converted number: {number}")
except ValueError:
    print("Invalid number format")
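A plain replace(",", ".") breaks down once the input also contains grouping separators: "1.234,56" becomes "1.234.56", which float() rejects. The sketch below assumes the input always uses "." for grouping and "," as the decimal mark; parse_decimal_comma is a hypothetical helper written for this illustration.

```python
def parse_decimal_comma(text):
    """Parse strings like '1.234,56', treating '.' as grouping and ',' as the decimal mark."""
    # Assumption: the input follows the comma-decimal convention throughout.
    normalized = text.replace(".", "").replace(",", ".")
    return float(normalized)

print(parse_decimal_comma("3,14"))
print(parse_decimal_comma("1.234,56"))
```

The assumption matters: under this convention "3.14" would be read as 314.0, so the helper should only be applied to input known to use a comma as the decimal separator.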
Example: Using the locale Module
The locale module can help interpret numbers correctly based on the user's locale.
import locale
# Set the locale (e.g., German, which uses ',' as a decimal separator)
locale.setlocale(locale.LC_NUMERIC, 'de_DE.UTF-8')
val = "3,14"
try:
    number = locale.atof(val) # Convert using the locale-aware function
    print(f"Converted number: {number}")
except ValueError:
    print("Invalid number format")
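locale.setlocale() raises locale.Error when the requested locale is not installed, and it changes a process-wide setting. A more defensive variant might look like the sketch below; parse_with_locale is a hypothetical helper, and returning None for an unavailable locale is an assumption made for the example.

```python
import locale

def parse_with_locale(text, locale_name="de_DE.UTF-8"):
    """Parse text with locale.atof under locale_name; return None if the locale is unavailable."""
    previous = locale.setlocale(locale.LC_NUMERIC)  # query the current setting
    try:
        locale.setlocale(locale.LC_NUMERIC, locale_name)
    except locale.Error:
        return None  # locale not installed on this system
    try:
        return locale.atof(text)
    finally:
        # Restore the earlier setting so other code is not affected.
        locale.setlocale(locale.LC_NUMERIC, previous)

print(parse_with_locale("3,14"))
```

A ValueError from locale.atof() is deliberately left to propagate, so the caller can treat malformed input the same way as with float().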
In this article we showed ways to check and convert strings in Python:
- We used built-in string methods like isdecimal() and isnumeric().
- We also used regular expressions to handle different number formats.
- We showed how to safely change input strings into integers and floats using Python's int() and float() functions.
These skills are important for working with numbers in Python. Example code will be on GitHub soon.