String Operations in Python
This tutorial introduces string operations in Python. It starts with determining the length of a string, concatenating strings, and using the index operator to access individual characters. It then shows how to iterate over strings and how to search and compare them, and briefly explains the standard string methods in Python.
- Length of a string
- Concatenating strings
- Strings and the index operator
- All characters from index 1 on
- Characters between index 2 and 5
- The last 3 characters
- The first two characters
- The penultimate character
- Reverse the order of characters (invert string)
- String/Byte conversion and Encoding
- Iterating over strings
- Checking if another string is contained in a string
- Common string methods
Length of a string
One can determine the length of a string using the function len():
jberries = 'jberries'
len(jberries) # => 8
The Python interpreter provides a wide range of built-in functions and types that are available in every script. The <a href="https://docs.python.org/3/library/functions.html#len">len</a> function is one of these built-in functions.
Concatenating strings
To concatenate strings, you can use the plus "+" operator as often seen in other programming languages:
foo = "foo"
bar = "bar"
print(foo + bar) # => foobar
However, Python does not automatically convert a value to a string if one of the operands is not a string. In this case, it raises a TypeError. To avoid the error, we can convert the value explicitly with the "str()" function:
foo = "foo"
bar = 5
print(foo + bar) # => TypeError: can only concatenate str (not "int") to str
print(foo + str(bar)) # => foo5
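As a complement to the str() conversion above, the following minimal sketch shows two other common ways to build a string from mixed values: an f-string and str.join. The variable names combined, joined and parts are chosen only for this illustration.

```python
foo = "foo"
bar = 5

# Concatenating str and int directly raises a TypeError.
try:
    foo + bar
except TypeError as exc:
    print("TypeError:", exc)

# An f-string formats the value in place, so no explicit str() call is needed.
combined = f"{foo}{bar}"
print(combined)  # => foo5

# str.join concatenates many strings; non-string items must be converted first.
parts = [foo, str(bar), "baz"]
joined = "".join(parts)
print(joined)  # => foo5baz
```

For a handful of values the plus operator is fine; join tends to be the more readable choice when the number of parts is not known in advance.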
Strings and the index operator
With the "index operator," one can access the individual characters of a string in Python. It is very easy, for example, to extract the first or last x characters of a string or access a specific character. The index operator can handle both positive and negative indices:
- Positive values count from the beginning of the string, starting at 0,
- negative values count from the end, starting at -1.
0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 |
j | b | e | r | r | i | e | s |
-8 | -7 | -6 | -5 | -4 | -3 | -2 | -1 |
Here are some examples:
All characters from index 1 on
print(jberries[1:]) # => berries
Characters between index 2 and 5
print(jberries[2:5]) # => err
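The slice above returns three characters because a slice includes its start index but excludes its stop index. The following sketch illustrates this, along with the relationship between positive and negative indices and the different handling of out-of-range values; the names middle and clipped are chosen only for this illustration.

```python
jberries = "jberries"

# A slice [start:stop] includes start but excludes stop.
middle = jberries[2:5]
print(middle)  # => err
print([jberries[i] for i in range(2, 5)])  # => ['e', 'r', 'r']

# Every character has a positive and an equivalent negative index.
for i, c in enumerate(jberries):
    assert jberries[i - len(jberries)] == c

# Slices are clipped to the bounds of the string ...
clipped = jberries[5:100]
print(clipped)  # => ies

# ... while a single out-of-range index raises an IndexError.
try:
    jberries[100]
except IndexError as exc:
    print("IndexError:", exc)
```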
The last 3 characters
print(jberries[-3:]) # => ies
The first two characters
print(jberries[:-6]) # => jb
# or
print(jberries[0:2]) # => jb
The penultimate character
print(jberries[-2]) # => e
Reverse the order of characters (invert string)
jberries = 'jberries'
jberries_inv = jberries[::-1]
print(jberries_inv) # => seirrebj
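The slice with step -1 is the most compact way to reverse a string. As a sketch of an alternative, the built-in reversed() can be combined with str.join, and reversal makes a simple palindrome check possible. The helper is_palindrome is a hypothetical function written for this example, not part of Python.

```python
jberries = "jberries"

# reversed() yields the characters back to front; join builds a string again.
reversed_join = "".join(reversed(jberries))
print(reversed_join)  # => seirrebj
print(reversed_join == jberries[::-1])  # => True


def is_palindrome(text):
    """Hypothetical helper: True if text reads the same in both directions."""
    return text == text[::-1]


print(is_palindrome("level"))   # => True
print(is_palindrome(jberries))  # => False
```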
String/Byte conversion and Encoding
Converting a string into a bytes object using the UTF-8 encoding:
test = bytes("😀JBerries", encoding="utf-8")
Afterwards, we can transform the bytes object back into a string:
str(test, encoding="ascii", errors="ignore") # => JBerries
Here we have selected the ASCII encoding, which does not include the smiley character. With errors="ignore", the bytes that cannot be decoded are dropped, which suppresses the UnicodeDecodeError that would otherwise be raised.
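The same conversions can also be written with the str.encode and bytes.decode methods. The sketch below assumes the same example string and shows that the character count and the byte count differ, that decoding with the matching encoding restores the original, and what happens with the default strict error handling. The variable names are chosen only for this illustration.

```python
text = "😀JBerries"

# encode() turns a str into bytes; the emoji needs 4 bytes in UTF-8.
data = text.encode("utf-8")
print(len(text))  # => 9
print(len(data))  # => 12

# Decoding with the same encoding restores the original string.
print(data.decode("utf-8") == text)  # => True

# With the default errors="strict", undecodable bytes raise an error.
strict_failed = False
try:
    data.decode("ascii")
except UnicodeDecodeError:
    strict_failed = True
print(strict_failed)  # => True

# errors="replace" substitutes a placeholder instead of dropping the bytes.
replaced = data.decode("ascii", errors="replace")
print(replaced.endswith("JBerries"))  # => True
```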
Iterating over strings
The simplest way is with a for loop:
for c in "jberries":
    print(c)
If we only want to iterate over a part of the string, we can use range in the for loop. Note that the end value of range is not included:
jb = "jberries"
for i in range(2, 4):
    print(jb[i])
# => e
# => r
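Two variations on the loops above, given as a sketch: enumerate() provides the index together with each character, and iterating over a slice visits the same characters as the range-based loop without explicit indexing. The list names pairs and part are chosen only for this illustration.

```python
jb = "jberries"

# enumerate() yields (index, character) pairs.
pairs = []
for i, c in enumerate(jb):
    pairs.append((i, c))
print(pairs[:3])  # => [(0, 'j'), (1, 'b'), (2, 'e')]

# Iterating over a slice visits the same characters as range(2, 4).
part = []
for c in jb[2:4]:
    part.append(c)
print(part)  # => ['e', 'r']
```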
Checking if another string is contained in a string
In Python, the in operator checks this directly:
if "jb" in "jberries":
    print("jb is in jberries")
# => jb is in jberries
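The in operator only answers yes or no. The following sketch shows a few related checks: the negated form, finding the position of a match, counting occurrences, and a case-insensitive test done by lowercasing both sides. The variable names are chosen only for this illustration.

```python
text = "jberries"

# "not in" is the negated containment check.
print("xyz" not in text)  # => True

# find() returns the index of the first match, or -1 if there is none.
position = text.find("rr")
missing = text.find("xyz")
print(position)  # => 3
print(missing)   # => -1

# count() returns the number of non-overlapping occurrences.
occurrences = text.count("r")
print(occurrences)  # => 2

# The check is case-sensitive; lowercasing both sides works around that.
print("JB" in text)                  # => False
print("JB".lower() in text.lower())  # => True
```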
Common string methods
- s.lower(), s.upper() - return the string in lowercase or uppercase, respectively
- s.strip() - returns the string without whitespace at the beginning and end of the string, e.g.,
s = " Hello World "; print(s.strip()) # => Hello World
- s.isalpha() - checks if the string consists only of letters (including, for example, umlauts)
- s.isdigit() - checks if the string consists only of digits
- s.isspace() - checks if the string consists only of "whitespace" characters (spaces, tabs, line breaks, etc.)
- s.startswith('foo') - checks if the string starts with a given string
- s.endswith('bar') - checks if the string ends with a given string
- s.find('foo') - searches for the given string and returns the index of its first occurrence, or -1 if it is not found
- s.replace('foo', 'bar') - returns a copy of the string in which every occurrence of the search string has been replaced by another string
- s.split('foo') - splits the string at the given delimiter and returns a list of the parts (the delimiter is a plain string here, not a regular expression)
- s.join(list) - joins a list of strings, using s as the separator, e.g.,
', '.join(['a', 'b', 'c']) # => a, b, c
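The methods listed above can be tried out with a short script. This is a minimal sketch with example values chosen only for illustration. Since strings are immutable, each method returns a new value and leaves the original string unchanged.

```python
s = "  Hello World  "

# strip() removes leading and trailing whitespace; s itself is unchanged.
stripped = s.strip()
print(stripped)          # => Hello World
print(stripped.lower())  # => hello world
print(stripped.upper())  # => HELLO WORLD

# The is...() checks return True or False.
print("Größe".isalpha())   # => True
print("2024".isdigit())    # => True
print("20.24".isdigit())   # => False
print(" \t\n".isspace())   # => True

# Prefix and suffix checks.
print("foobar".startswith("foo"))  # => True
print("foobar".endswith("bar"))    # => True

# replace() substitutes every occurrence.
replaced = "foo foo".replace("foo", "bar")
print(replaced)  # => bar bar

# split() and join() are inverse operations for a fixed separator.
parts = "a, b, c".split(", ")
print(parts)             # => ['a', 'b', 'c']
print(", ".join(parts))  # => a, b, c
```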