Converting Strings to Numbers in Python

When working with data in Python, it's often necessary to convert strings that represent numbers into actual numerical values for manipulation and analysis. In this article, we explore methods for converting strings to integers and floats, as well as techniques to validate if a string is numeric, ensuring accurate conversions.

Using Built-in Methods: isdecimal() and isnumeric()

Python strings provide two built-in methods, isdecimal() and isnumeric(), that check whether every character in a string is numeric. isdecimal() returns True only if all characters are decimal digits (0-9, as well as decimal digits from other scripts), while isnumeric() also accepts any other character that Unicode classifies as numeric, such as superscripts and fractions. Both methods return False for an empty string.

Example:

val = input("Your input: ")
print(val)
print(val.isdecimal())
print(val.isnumeric())

If we run this example with "123" as input, the string contains only decimal digits, so both isdecimal() and isnumeric() return True. For "²" (superscript two), isdecimal() returns False, but isnumeric() still recognizes "²" as a numeric character and returns True.

General Rule: If a character represents a numeric value in Unicode but is not a decimal digit (such as 0-9), isnumeric() returns True for it and isdecimal() returns False.
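To make the rule concrete, the short sketch below runs both methods on a handful of characters. The sample characters are chosen only for illustration.

```python
# Compare isdecimal() and isnumeric() on a few illustrative samples.
samples = {
    "123": "ASCII digits",
    "٣": "Arabic-Indic digit three",
    "²": "superscript two",
    "½": "vulgar fraction one half",
}

# Map each sample to a pair: (isdecimal result, isnumeric result)
results = {s: (s.isdecimal(), s.isnumeric()) for s in samples}

for s, (dec, num) in results.items():
    print(f"{s!r:8} {samples[s]:28} isdecimal={dec!s:5} isnumeric={num}")
```

The first two samples pass both checks, while the superscript and the fraction pass only isnumeric().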

Neither isdecimal() nor isnumeric() handles signs or decimal separators (such as . versus ,), because these methods check only whether each character is itself numeric. They do not interpret the string as a numeric value or account for localization, so strings such as "3.14", "3,14", and "-5" return False for both.
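A small sketch of this limitation follows. The helper name passes_character_checks is hypothetical and exists only for this illustration.

```python
def passes_character_checks(text):
    """Return True if either character-level check accepts the string."""
    return text.isdecimal() or text.isnumeric()

# Separators and signs are not numeric characters, so these are all rejected.
for text in ["3.14", "3,14", "-5"]:
    print(f"{text!r}: {passes_character_checks(text)}")

# A plain run of digits is accepted.
print(f"{'42'!r}: {passes_character_checks('42')}")
```

In practice this means the two methods are best suited to checking unsigned whole numbers, not arbitrary numeric input.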

Using Regular Expressions

In addition to Python's built-in methods for checking numeric strings, a simple regular expression like r"^\d+$" can be used to achieve a similar result.

Example:

import re
val = "123"
if re.match(r"^\d+$", val):
    print("Valid integer")
else:
    print("Nope - not a number")

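How It Works:

The pattern reads left to right: ^ anchors the match at the start of the string, \d+ requires one or more digits, and $ anchors the match at the end. The string therefore has to consist of digits from beginning to end.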
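One detail worth knowing is that $ in Python also matches just before a single trailing newline. The sketch below compares the anchored pattern with re.fullmatch(), which requires the entire string to match. The helper names are hypothetical.

```python
import re

ANCHORED = re.compile(r"^\d+$")

def matches_with_anchors(text):
    """Check the string with the ^...$ pattern used above."""
    return ANCHORED.match(text) is not None

def matches_fully(text):
    """Check the string with re.fullmatch(), which needs no anchors."""
    return re.fullmatch(r"\d+", text) is not None

for text in ["123", "12a", "", "123\n"]:
    print(f"{text!r:8} anchors={matches_with_anchors(text)!s:5} fullmatch={matches_fully(text)}")
```

If input may carry a trailing newline (for example, lines read from a file), re.fullmatch() is probably the stricter and safer choice.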

Limitations

  1. No Handling of Floating-Point Numbers: The regex does not account for decimal points (. or ,) and will reject valid floating-point numbers like "3.14" or "2,718".

  2. No Support for Signs (+/-): It does not allow positive or negative signs (+ or -). For example, "-123" and "+456" will be invalid.

  3. Locale Issues: Similar to isdecimal(), this regex does not handle locale-specific formats, such as numbers with commas as the decimal separator ("3,14").

  4. Excludes Non-Decimal Numeric Characters: \d matches only decimal digits (0-9 and, for str patterns in Python 3, decimal digits from other scripts). Numeric characters like superscripts (²) or fractions (such as ½) are excluded.
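The limitations listed above can be checked quickly by running the pattern against one input per case. This is only a sketch, and is_digits_only is a hypothetical helper name.

```python
import re

DIGITS_ONLY = re.compile(r"^\d+$")

def is_digits_only(text):
    """Return True if the basic digits-only pattern accepts the string."""
    return DIGITS_ONLY.match(text) is not None

# One sample per limitation: floats, signs, locale separators, other numerics.
for text in ["3.14", "2,718", "-123", "+456", "3,14", "²", "½"]:
    print(f"{text!r:8} -> {is_digits_only(text)}")
```

Every sample in the loop is rejected, even though each one represents a number in some sense.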

Improving the Regex

To make the regex more flexible, we can extend it to accept +/- signs, decimal points, and locale-specific decimal separators.

Optional +/- Sign:

r"^-?\d+$"  # Allows optional leading "-" for negative numbers
r"^[+-]?\d+$"  # Allows both "+" and "-" for signed numbers
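As a minimal sketch, the signed pattern can be combined with int() so that validation and conversion happen in one place. The function name parse_signed_int is hypothetical.

```python
import re

SIGNED_INT = re.compile(r"^[+-]?\d+$")

def parse_signed_int(text):
    """Return an int if text is an optionally signed run of digits, else None."""
    if SIGNED_INT.match(text):
        return int(text)
    return None

for text in ["-123", "+456", "--1", "12-"]:
    print(f"{text!r:8} -> {parse_signed_int(text)}")
```

Returning None for invalid input is a design choice made for this example; raising an exception would work just as well.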

To Validate Floating-Point Numbers:

r"^-?\d+(\.\d+)?$"  # Allows integers and floats with an optional decimal point
r"^[+-]?\d+(\.\d+)?$"  # Allows signed integers and floats
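Note that this pattern is stricter than float() itself. The sketch below compares the two on a few inputs; the helper names are hypothetical.

```python
import re

FLOAT_PATTERN = re.compile(r"^[+-]?\d+(\.\d+)?$")

def matches_float_pattern(text):
    """Return True if the signed integer-or-float pattern accepts the string."""
    return FLOAT_PATTERN.match(text) is not None

def float_accepts(text):
    """Return True if the built-in float() can convert the string."""
    try:
        float(text)
        return True
    except ValueError:
        return False

for text in ["3.14", "-2", ".5", "5.", "1e3"]:
    print(f"{text!r:8} regex={matches_float_pattern(text)!s:5} float()={float_accepts(text)}")
```

Inputs such as ".5", "5.", and "1e3" are valid for float() but rejected by the regex, so the pattern would need further extension if those forms should be allowed.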

To Handle Locale-Specific Decimal Separators:

You can replace the decimal separator dynamically based on the locale:

import locale
import re

# Example for a European locale; raises locale.Error if de_DE.UTF-8 is not installed
locale.setlocale(locale.LC_NUMERIC, 'de_DE.UTF-8')
decimal_separator = locale.localeconv()['decimal_point']  # Get the locale's decimal separator

# Build the regex dynamically
regex = rf"^[+-]?\d+({re.escape(decimal_separator)}\d+)?$"
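Because the German locale may not be installed on every system, it can help to separate building the pattern from looking up the separator. The sketch below takes the separator as a parameter; build_number_pattern is a hypothetical helper, and in real use its argument could come from locale.localeconv()['decimal_point'].

```python
import re

def build_number_pattern(decimal_separator):
    """Compile a signed-number regex for the given decimal separator."""
    # re.escape() keeps a separator such as "." from acting as a wildcard.
    sep = re.escape(decimal_separator)
    return re.compile(rf"^[+-]?\d+({sep}\d+)?$")

comma_style = build_number_pattern(",")
dot_style = build_number_pattern(".")

print(bool(comma_style.match("3,14")))
print(bool(comma_style.match("3.14")))
print(bool(dot_style.match("3x14")))
```

Passing the separator explicitly also makes the pattern easy to test without changing the process-wide locale.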

Converting Strings to Integers or Floats

Python’s int() and float() functions handle the actual conversion of strings to numeric types. If the input string is not a valid number for the target type, a ValueError is raised.

Integer Conversion

a = "123"
b = int(a)
print(b)  # Output: 123

Float Conversion

a = "12.3"
b = float(a)
print(b)  # Output: 12.3
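Both functions are somewhat more lenient than a digits-only check. The following sketch illustrates a few inputs they accept and one that int() rejects; the variable names are illustrative only.

```python
# int() ignores surrounding whitespace, allows a sign, and allows
# single underscores between digits.
whitespace_ok = int("  42  ")
negative = int("-7")
grouped = int("1_000")

# float() additionally understands scientific notation and a bare leading dot.
scientific = float("1e3")
leading_dot = float(".5")

# int() does not accept a decimal point, even if the value looks numeric.
try:
    int("12.3")
    int_accepts_decimal_point = True
except ValueError:
    int_accepts_decimal_point = False

print(whitespace_ok, negative, grouped, scientific, leading_dot)
print(int_accepts_decimal_point)
```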

Exception Handling

a = "JBerries"
try:
    b = int(a)
except ValueError:
    print("Invalid input!")  # Output: Invalid input!
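The same try/except idea can be wrapped in a reusable helper. The sketch below tries int() first and falls back to float(); the name to_number is hypothetical, and the function assumes its argument is a str.

```python
def to_number(text):
    """Try int() first, then float(); return None if neither succeeds."""
    for convert in (int, float):
        try:
            return convert(text)
        except ValueError:
            continue  # fall through to the next converter
    return None

for text in ["123", "12.3", "JBerries"]:
    print(f"{text!r:12} -> {to_number(text)!r}")
```

Trying int() before float() keeps whole numbers as integers instead of turning "123" into 123.0.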

Converting Locale-Specific Numbers

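int() and float() expect a period as the decimal separator, so a locale-formatted string such as "3,14" raises a ValueError. To handle such cases, you can: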

  1. Replace the decimal separator according to the locale before converting.
  2. Use the locale module in Python for proper numeric parsing.

Example: Replacing Locale-Specific Separators

val = "3,14"  # Example using comma as decimal separator
normalized_val = val.replace(",", ".")
try:
    number = float(normalized_val)
    print(f"Converted number: {number}")
except ValueError:
    print("Invalid number format")
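A plain replace(",", ".") breaks down once thousands separators appear, as in "1.234,56". The sketch below removes the grouping character first. Both helper names and their default separators (a German-style format) are assumptions made for this example.

```python
def normalize_number(text, decimal_sep=",", thousands_sep="."):
    """Rewrite a locale-formatted number so float() can parse it."""
    without_grouping = text.replace(thousands_sep, "")
    return without_grouping.replace(decimal_sep, ".")

def parse_number(text, decimal_sep=",", thousands_sep="."):
    """Return the parsed float, or None if the text is not a number."""
    try:
        return float(normalize_number(text, decimal_sep, thousands_sep))
    except ValueError:
        return None

for text in ["3,14", "1.234,56", "abc"]:
    print(f"{text!r:12} -> {parse_number(text)}")
```

This approach assumes the caller already knows which format the input uses; with the defaults above, a string like "3.14" would be read as 314.0 because the period is treated as a grouping character.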

Example: Using the locale Module

The locale module can help interpret numbers correctly based on the user's locale.

import locale

# Set the locale (e.g., German, which uses ',' as a decimal separator)
locale.setlocale(locale.LC_NUMERIC, 'de_DE.UTF-8')  

val = "3,14"
try:
    number = locale.atof(val)  # Convert using the locale-aware function
    print(f"Converted number: {number}")
except ValueError:
    print("Invalid number format")
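Since locale.setlocale() changes a process-wide setting and raises locale.Error when the requested locale is not installed, a slightly more defensive variant may be useful. In the sketch below, parse_localized is a hypothetical helper that restores the previous setting afterwards.

```python
import locale

def parse_localized(text, locale_name):
    """Parse text with locale.atof() under locale_name; None on any failure."""
    # Calling setlocale() without a locale argument returns the current setting.
    saved = locale.setlocale(locale.LC_NUMERIC)
    try:
        locale.setlocale(locale.LC_NUMERIC, locale_name)
        return locale.atof(text)
    except (locale.Error, ValueError):
        # locale.Error: locale not available; ValueError: text is not a number
        return None
    finally:
        locale.setlocale(locale.LC_NUMERIC, saved)

# The "C" locale is always available and uses "." as the decimal separator.
print(parse_localized("3.14", "C"))
print(parse_localized("abc", "C"))
```

On a system with the German locale installed, parse_localized("3,14", "de_DE.UTF-8") should behave like the example above. Note that changing the locale is not thread-safe, so this pattern is best kept to single-threaded code.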

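In this article we showed ways to check and convert strings in Python: validating input with isdecimal() and isnumeric(), matching numeric formats with regular expressions, converting with int() and float(), and handling locale-specific formats with the locale module.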