In programming, a string is a sequence of characters that can act as a variable or a constant. It is one of the most widely used data types in Python. In fact, the growing prevalence of Python in bioinformatics is primarily due to the ease with which it performs operations on strings. A string variable can be assigned a value using either single or double quotes.
x = 'String A'
var2 = "Another variable"
print(x)
print(var2)
String A
Another variable
To check the data type of a variable, we can use the type function. To get the number of characters in a string variable, use the len function. Note that a blank space also counts as a character.
print(type(x))
print(len(var2))
<class 'str'>
16
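As a small sketch of the point about blank spaces, the snippet below compares the length of a string with and without its space. The variable names `with_space` and `without_space` are made up for this illustration.

```python
# Hypothetical variables, used only for illustration
with_space = "Another variable"
without_space = "Anothervariable"

# The space between the two words counts towards the length
print(len(with_space))     # 16
print(len(without_space))  # 15

# An empty string has length zero
print(len(""))             # 0
```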
Joining two or more things end to end is called concatenation. The arithmetic operators + and * can be used directly with strings to concatenate (addition) or repeat (multiplication). Giving an operator additional behaviour beyond its existing one is called operator overloading. For example, the plus (+) operator performs addition when its operands are integers. However, if the operands are strings, it acts as a concatenation operator instead of an addition operator.
# The plus (+) operator with two numbers
2+3
5
var1 = "Hello"
var2 = "World!"
print(var1+var2)
HelloWorld!
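Because the behaviour of + depends on the operand types, mixing a string with a number is an error rather than an implicit conversion. The sketch below, using the made-up names `label` and `number`, shows the error and one possible way around it with the built-in str function.

```python
# Hypothetical variables, used only for illustration
label = "Sample"
number = 7

try:
    result = label + number   # str + int is not defined
except TypeError as err:
    print("TypeError:", err)

# Converting the number with str() makes both operands strings
result = label + str(number)
print(result)   # Sample7
```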
The asterisk (*) operator, which multiplies two numbers, behaves differently when its operands are a string (s) and an integer (n): the output is s repeated n times. This behaviour is analogous to the multiplication of two numbers. For instance, say we want to multiply 5 by 3 (5*3). This multiplication can also be written as the sum of three 5s, i.e. 5+5+5. So, when we use a string (s) and an integer (n) as operands for the * operator, we get s+s+s… (n times).
var3 = var1*3
print(var3)
HelloHelloHello
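A short sketch of two details of string repetition: the order of the operands does not matter, and repeating zero times gives an empty string. The name `motif` is an assumption made for this example.

```python
# Hypothetical variable, used only for illustration
motif = "AT"

# The string and the integer can appear in either order
print(motif * 4)   # ATATATAT
print(4 * motif)   # ATATATAT

# Repeating a string zero times yields an empty string
print(motif * 0 == "")   # True
```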
Slicing is another very useful operation for manipulating strings. The slice operator [] gives the characters between the start and end positions, which are separated by a colon. The numbering of characters within a string starts from 0. Note that the character at the start position is included in the output but the character at the end position is not. Slicing effectively returns a substring of the given string. The general syntax for slicing a string is as follows:
string[start:end]
string[start:]
string[:end]
string[start:end:step]
Let’s see some examples to get a better understanding of the slice operation.
var4 = "ABCDEFG"
print(var4)
print(var4[1:5])
ABCDEFG
BCDE
If no value is given before the colon, the slice starts at the beginning of the string; if no value is given after the colon, the slice runs to the end of the string.
print(var3)
print(var3[:7])
print(var3[3:])
HelloHelloHello
HelloHe
loHelloHello
The step part of the slice operator specifies how many positions to advance at a time when going from the start position to the end position. The default step size is 1. We can change the default by specifying the step parameter within the slice.
print(var3)
print(var3[2::2])
HelloHelloHello
loelHlo
Quiz: Write a command that outputs ‘HHH’ given a string ‘HelloHelloHello’.
var3 = "HelloHelloHello"
print(var3[::5])
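As a bioinformatics-flavoured sketch of slicing, the snippet below cuts a short DNA string into consecutive three-letter codons and then uses a step of 3 to pick the first letter of each codon. The sequence and the names `seq` and `codons` are made up for illustration.

```python
# Hypothetical DNA sequence, used only for illustration
seq = "ATGGCCTAA"

# Collect consecutive, non-overlapping slices of three characters
codons = []
for start in range(0, len(seq), 3):
    codons.append(seq[start:start + 3])
print(codons)    # ['ATG', 'GCC', 'TAA']

# A step of 3 picks the first character of every codon
print(seq[::3])  # AGT
```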
One of the most frequently required tasks in programming is string comparison. In Python, the comparison operator == (two equals signs with no space between them) can be used to compare two strings. The output of a comparison is a boolean value, i.e. either True or False. String comparison is case sensitive.
var1 = 'Hello'
var2 = "Hello"
var3 = 'Hi'
print(var1 == var2)
print(var1 == var3)
print(var1 == "hello")
True
False
False
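When case should be ignored, one common approach is to convert both strings to the same case before comparing them. The sketch below uses the lower method of str; the names `a` and `b` are assumptions made for this example.

```python
# Hypothetical variables, used only for illustration
a = "Hello"
b = "hello"

# A direct comparison is case sensitive
print(a == b)                  # False

# Lower-casing both sides first ignores differences in case
print(a.lower() == b.lower())  # True
```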
Sometimes a string needs to be split at a certain delimiter; the split function is designed for that task. Python strings have a split method that returns a list of elements after splitting the string at the delimiter. By default, the string is split on whitespace.
s1 = "This is a sentence."
words1 = s1.split()
print(words1)
#split with comma as a delimiter
s2 = 'This is an another sentence, a longer one.'
words2 = s2.split(",")
print(words2)
['This', 'is', 'a', 'sentence.']
['This is an another sentence', ' a longer one.']
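The default behaviour differs slightly from passing a single space explicitly: with no argument, runs of whitespace are treated as one delimiter, whereas an explicit " " splits at every space. A minimal sketch, with a made-up variable `text`:

```python
# Hypothetical string with two spaces between the words
text = "two  spaces"

# No argument: consecutive whitespace counts as a single delimiter
print(text.split())      # ['two', 'spaces']

# Explicit single space: an empty string appears between the two spaces
print(text.split(" "))   # ['two', '', 'spaces']

# join is the counterpart of split and glues list elements back together
print("-".join(text.split()))   # two-spaces
```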
Quiz: What would be the output if we split s2 using “is” as a delimiter?
s2 = 'This is an another sentence, a longer one.'
print(s2.split("is"))
##Output would be a list with three elements:
##['Th', ' ', ' an another sentence, a longer one.']
Python strings have several methods for working with string objects. Below are examples of some of the methods available in class ‘str’. Methods such as upper, lower and replace act on the string and return a new string after doing the required manipulation; others, such as startswith and index, return a boolean or an integer. For additional methods, please refer to the Python documentation.
s1 = "Apple"
String function          Output
-------------------------------
s1.upper()               APPLE
s1.lower()               apple
s1.startswith("a")       False
s1.startswith("A")       True
s1.index("l")            3
s1.replace("e","es")     Apples
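The sketch below illustrates that these methods leave the original string untouched and that their results can be chained. The name `upper_s1` is an assumption made for this example.

```python
s1 = "Apple"

# upper() returns a new string; s1 itself is not modified
upper_s1 = s1.upper()
print(upper_s1)   # APPLE
print(s1)         # Apple

# Method calls can be chained because each one returns a string
print(s1.replace("e", "es").lower())   # apples
```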
To declare a variable whose value is a long string that spans multiple lines, triple quotes can be used. All whitespace, such as tabs and newlines, is considered part of the string. These types of strings are generally used for documentation purposes, e.g. writing help text for custom functions.
var4 = """This is an example of
a long string that spans
three lines."""
print(var4)
This is an example of
a long string that spans
three lines.
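As a sketch of the documentation use case, a triple-quoted string placed as the first statement of a function becomes its help text (docstring). The function `gc_content` below is a hypothetical example, not part of any library.

```python
# Hypothetical helper function, used only to illustrate a docstring
def gc_content(seq):
    """Return the fraction of G and C characters in seq.

    seq is expected to be an upper-case DNA string.
    """
    if len(seq) == 0:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)

# The docstring is stored on the function and is shown by help()
print(gc_content.__doc__)
print(gc_content("ATGC"))   # 0.5
```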
A string variable can store characters, including whitespace (space, tab, newline). A string variable is an object of class ‘str’. To initialize a string variable, single or double quotes can be used.
In Python, strings are immutable, i.e. their contents cannot be changed once they have been created. A variable can, however, be reassigned: it can be made to refer to a different string, but the characters of an existing string cannot be modified in place.
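A minimal sketch of the difference between modifying and reassigning, using a made-up variable `word`:

```python
# Hypothetical variable, used only for illustration
word = "Hello"

try:
    word[0] = "J"   # attempting to change a character in place
except TypeError as err:
    print("TypeError:", err)

# Reassignment builds a new string and binds the name to it
word = "J" + word[1:]
print(word)   # Jello
```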
A string is also a sequence, i.e. an ordered collection of characters. We can iterate through the characters in a string just as we can iterate through a list. Unlike lists, however, characters cannot be appended to a string in place because strings are an immutable data type.
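The sketch below iterates over a string one character at a time and shows that "appending" really creates a new string. The sequence and the names `seq`, `count_a` and `extended` are assumptions made for this example.

```python
# Hypothetical DNA sequence, used only for illustration
seq = "GATTACA"

# Iterate over the characters and count the occurrences of "A"
count_a = 0
for ch in seq:
    if ch == "A":
        count_a += 1
print(count_a)    # 3

# Strings have no append method; concatenation creates a new string instead
extended = seq + "G"
print(extended)   # GATTACAG
print(seq)        # GATTACA
```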