Comparison of programming languages (string functions)
String functions are used in computer programming languages to manipulate a string or query information about a string (some do both).
Most programming languages that have a string datatype will have some string functions, although there may be other low-level ways within each language to handle strings directly. In object-oriented languages, string functions are often implemented as properties and methods of string objects. In functional and list-based languages a string is represented as a list (of character codes); therefore, all list-manipulation procedures could be considered string functions. However, such languages may implement a subset of explicit string-specific functions as well.
For functions that manipulate strings, modern object-oriented languages such as C# and Java have immutable strings and return a copy (in newly allocated dynamic memory), while others, such as C, manipulate the original string unless the programmer copies the data to a new string. See, for example, Concatenation below.
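The distinction can be sketched in Python, whose str type is immutable and whose bytearray type is a mutable buffer loosely comparable to a C character array. The variable names are illustrative only.

```python
# Immutable string: upper() returns a new object and leaves the original alone.
original = "hello"
shouted = original.upper()

# Mutable buffer: the same storage is changed in place.
buffer = bytearray(b"hello")
buffer[0:1] = b"J"

print(original)                 # hello
print(shouted)                  # HELLO
print(buffer.decode("ascii"))   # Jello
```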
The most basic example of a string function is the length(string) function, which returns the length of a string.
- e.g. length("hello world") would return 11.
Other languages may have string functions with similar or identical syntax, parameters, or outcomes. For example, in many languages the length function is written len(string). The list of common functions below aims to help limit this confusion.
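As a small illustration, Python is one of the languages that spells this function len. The variable name is only for the example.

```python
# len() counts the characters (Unicode code points) in a str.
greeting = "hello world"
print(len(greeting))  # 11
print(len(""))        # 0
```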
Common string functions (multi language reference)
String functions common to many languages are listed below, including the different names used. The list aims to help programmers find the equivalent function in a language. Note that string concatenation and regular expressions are handled in separate pages. Statements in guillemets (« … ») are optional.
CharAt
{ Example in Pascal }
var
MyStr: string = 'Hello, World';
MyChar: Char;
begin
MyChar := MyStr[2]; // 'e'
# Example in ALGOL 68 #
"Hello, World"[2]; # 'e' #
// Example in C
#include <stdio.h> // for printf
char MyStr[] = "Hello, World";
printf("%c", *(MyStr+1)); // 'e'
printf("%c", *(MyStr+7)); // 'W'
printf("%c", MyStr[11]); // 'd'
printf("%s", MyStr); // 'Hello, World'
printf("%s", "Hello(2), World(2)"); // 'Hello(2), World(2)'
// Example in C++
#include <iostream> // for "cout"
#include <string> // for "string" data type
using namespace std;
char MyStr1[] = "Hello(1), World(1)";
string MyStr2 = "Hello(2), World(2)";
cout << "Hello(3), World(3)"; // 'Hello(3), World(3)'
cout << MyStr2[6]; // '2'
cout << MyStr2.substr(5, 3); // '(2)'
// Example in C#
"Hello, World"[2]; // 'l'
# Example in Perl 5
substr("Hello, World", 1, 1); # 'e'
# Examples in Python
"Hello, World"[2] # 'l'
"Hello, World"[-3] # 'r'
# Example in Raku
"Hello, World".substr(1, 1); # 'e'
' Example in Visual Basic
Mid("Hello, World",2,1)
' Example in Visual Basic .NET
"Hello, World".Chars(2) ' "l"c
" Example in Smalltalk "
'Hello, World' at: 2. "$e"
//Example in Rust
"Hello, World".chars().nth(2); // Some('l')
Compare (integer result)
# Example in Perl 5
"hello" cmp "world"; # returns -1
# Example in Python 2 (cmp was removed in Python 3)
cmp("hello", "world") # returns -1
# Examples in Raku
"hello" cmp "world"; # returns Less
"world" cmp "hello"; # returns More
"hello" cmp "hello"; # returns Same
/** Example in Rexx */
compare("hello", "world") /* returns index of mismatch: 1 */
; Example in Scheme
(use-modules (srfi srfi-13))
; returns index of mismatch: 0
(string-compare "hello" "world" values values values)
Compare (relational operator-based, Boolean result)
% Example in Erlang
"hello" > "world". % returns false
# Example in Raku
"art" gt "painting"; # returns False
"art" lt "painting"; # returns True
# Example in Windows PowerShell
"hello" -gt "world" # returns false
;; Example in Common Lisp
(string> "art" "painting") ; returns nil
(string< "art" "painting") ; returns non nil
Concatenation
{ Example in Pascal }
'abc' + 'def'; // returns "abcdef"
// Example in C#
"abc" + "def"; // returns "abcdef"
' Example in Visual Basic
"abc" & "def" ' returns "abcdef"
"abc" + "def" ' returns "abcdef"
"abc" & Null ' returns "abc"
"abc" + Null ' returns Null
// Example in D
"abc" ~ "def"; // returns "abcdef"
;; Example in Common Lisp
(concatenate 'string "abc " "def " "ghi") ; returns "abc def ghi"
# Example in Perl 5
"abc" . "def"; # returns "abcdef"
"Perl " . 5; # returns "Perl 5"
# Example in Raku
"abc" ~ "def"; # returns "abcdef"
"Perl " ~ 6; # returns "Perl 6"
Contains
¢ Example in ALGOL 68 ¢
string in string("e", loc int, "Hello mate"); ¢ returns true ¢
string in string("z", loc int, "word"); ¢ returns false ¢
// Example in C#
"Hello mate".Contains("e"); // returns true
"word".Contains("z"); // returns false
# Example in Python
"e" in "Hello mate" # returns True
"z" in "word" # returns False
# Example in Raku
"Good morning!".contains('z') # returns False
"¡Buenos días!".contains('í'); # returns True
" Example in Smalltalk "
'Hello mate' includesSubstring: 'e' " returns true "
'word' includesSubstring: 'z' " returns false "
Equality
Tests if two strings are equal. See also #Compare (integer result) and #Compare (relational operator-based, Boolean result). Note that doing equality checks via a generic Compare with integer result is not only confusing for the programmer but is often a significantly more expensive operation; this is especially true when using "C-strings".
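The readability point can be sketched in Python. Python 3 has no built-in three-way string comparison, so the compare helper below is a hypothetical stand-in written for this example; it is not a standard library function.

```python
def compare(a: str, b: str) -> int:
    """Hypothetical three-way comparison: -1 if a < b, 0 if equal, 1 if a > b."""
    return (a > b) - (a < b)

# Equality through the three-way result works, but hides the intent ...
same_via_compare = compare("hello", "world") == 0
# ... whereas the equality operator states it directly.
same_via_equality = "hello" == "world"

print(compare("hello", "world"))            # -1
print(same_via_compare, same_via_equality)  # False False
```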
// Example in C#
"hello" == "world" // returns false
' Example in Visual Basic
"hello" = "world" ' returns false
# Examples in Perl 5
'hello' eq 'world' # returns 0
'hello' eq 'hello' # returns 1
# Examples in Raku
'hello' eq 'world' # returns False
'hello' eq 'hello' # returns True
# Example in Windows PowerShell
"hello" -eq "world" # returns false
⍝ Example in APL
'hello' ≡ 'world' ⍝ returns 0
Find
Examples
- Common Lisp
(search "e" "Hello mate") ; returns 1
(search "z" "word") ; returns NIL
- C#
"Hello mate".IndexOf("e"); // returns 1
"Hello mate".IndexOf("e", 4); // returns 9
"word".IndexOf("z"); // returns -1
- Raku
"Hello, there!".index('e') # returns 1
"Hello, there!".index('z') # returns Nil
- Scheme
(use-modules (srfi srfi-13))
(string-contains "Hello mate" "e") ; returns 1
(string-contains "word" "z") ; returns #f
- Visual Basic
InStr("Hello mate", "e") ' returns 2
InStr(5, "Hello mate", "e") ' returns 10
InStr("word", "z") ' returns 0
- Smalltalk
'Hello mate' indexOfSubCollection:'ate' "returns 8"
'Hello mate' indexOfSubCollection:'late' "returns 0"
'Hello mate' indexOfSubCollection:'late' ifAbsent:[ 99 ] "returns 99"
'Hello mate' indexOfSubCollection:'late' ifAbsent:[ self error ] "raises an exception"
Find character
// Examples in C#
"Hello mate".IndexOf('e'); // returns 1
"word".IndexOf('z') // returns -1
; Examples in Common Lisp
(position #\e "Hello mate") ; returns 1
(position #\z "word") ; returns NIL
Format
// Example in C#
String.Format("My {0} costs {1:C2}", "pen", 19.99); // returns "My pen costs $19.99"
// Example in Object Pascal (Delphi)
Format('My %s costs $%2f', ['pen', 19.99]); // returns "My pen costs $19.99"
// Example in Java
String.format("My %s costs $%.2f", "pen", 19.99); // returns "My pen costs $19.99"
# Examples in Raku
sprintf "My %s costs \$%.2f", "pen", 19.99; # returns "My pen costs $19.99"
1.fmt("%04d"); # returns "0001"
# Example in Python
"My %s costs $%.2f" % ("pen", 19.99); # returns "My pen costs $19.99"
"My {0} costs ${1:.2f}".format("pen", 19.99); # returns "My pen costs $19.99"
#Example in Python 3.6+
pen = "pen"
f"My {pen} costs {19.99}" #returns "My pen costs 19.99"
; Example in Scheme
(format "My ~a costs $~1,2F" "pen" 19.99) ; returns "My pen costs $19.99"
/* example in PL/I */
put string(some_string) edit('My ', 'pen', ' costs', 19.99)(a,a,a,p'$$$V.99')
/* returns "My pen costs $19.99" */
Inequality
Tests if two strings are not equal. See also #Equality.
// Example in C#
"hello" != "world" // returns true
' Example in Visual Basic
"hello" <> "world" ' returns true
;; Example in Clojure
(not= "hello" "world") ; ⇒ true
# Example in Perl 5
'hello' ne 'world' # returns 1
# Example in Raku
'hello' ne 'world' # returns True
# Example in Windows PowerShell
"hello" -ne "world" # returns true
index
see #Find
indexof
see #Find
instr
see #Find
instrrev
see #rfind
join
// Example in C#
String.Join("-", new[] {"a", "b", "c"}) // "a-b-c"
" Example in Smalltalk "
#('a' 'b' 'c') joinUsing: '-' " 'a-b-c' "
# Example in Perl 5
join( '-', ('a', 'b', 'c')); # 'a-b-c'
# Example in Raku
<a b c>.join('-'); # 'a-b-c'
# Example in Python
"-".join(["a", "b", "c"]) # 'a-b-c'
# Example in Ruby
["a", "b", "c"].join("-") # 'a-b-c'
; Example in Scheme
(use-modules (srfi srfi-13))
(string-join '("a" "b" "c") "-") ; "a-b-c"
lastindexof
see #rfind
left
# Example in Raku
"Hello, there!".substr(0, 6); # returns "Hello,"
/* Examples in Rexx */
left("abcde", 3) /* returns "abc" */
left("abcde", 8) /* returns "abcde   " */
left("abcde", 8, "*") /* returns "abcde***" */
; Examples in Scheme
(use-modules (srfi srfi-13))
(string-take "abcde" 3) ; returns "abc"
(string-take "abcde" 8) ; error
' Examples in Visual Basic
Left("sandroguidi", 3) ' returns "san"
Left("sandroguidi", 100) ' returns "sandroguidi"
len
see #length
length
// Examples in C#
"hello".Length; // returns 5
"".Length; // returns 0
% Examples in Erlang
string:len("hello"). % returns 5
string:len(""). % returns 0
# Examples in Perl 5
length("hello"); # returns 5
length(""); # returns 0
# Examples in Raku
"🏳️🌈".chars; chars "🏳️🌈"; # both return 1
"🏳️🌈".codes; codes "🏳️🌈"; # both return 4
"".chars; chars ""; # both return 0
"".codes; codes ""; # both return 0
' Examples in Visual Basic
Len("hello") ' returns 5
Len("") ' returns 0
//Examples in Objective-C
[@"hello" length] //returns 5
[@"" length] //returns 0
-- Examples in Lua
("hello"):len() -- returns 5
#"" -- returns 0
locate
see #Find
Lowercase
// Example in C#
"Wiki means fast?".ToLower(); // "wiki means fast?"
; Example in Scheme
(use-modules (srfi srfi-13))
(string-downcase "Wiki means fast?") ; "wiki means fast?"
/* Example in C */
#include <ctype.h>
#include <stdio.h>
int main(void) {
char string[] = "Wiki means fast?";
size_t i;
for (i = 0; i < sizeof(string) - 1; ++i) {
/* transform characters in place, one by one */
string[i] = (char) tolower((unsigned char) string[i]);
}
puts(string); /* "wiki means fast?" */
return 0;
}
# Example in Raku
"Wiki means fast?".lc; # "wiki means fast?"
mid
see #substring
partition
# Examples in Python
"Spam eggs spam spam and ham".partition('spam') # ('Spam eggs ', 'spam', ' spam and ham')
"Spam eggs spam spam and ham".partition('X') # ('Spam eggs spam spam and ham', "", "")
# Examples in Perl 5 / Raku
split /(spam)/, 'Spam eggs spam spam and ham' ,2; # ('Spam eggs ', 'spam', ' spam and ham');
split /(X)/, 'Spam eggs spam spam and ham' ,2; # ('Spam eggs spam spam and ham');
replace
// Examples in C#
"effffff".Replace("f", "jump"); // returns "ejumpjumpjumpjumpjumpjump"
"blah".Replace("z", "y"); // returns "blah"
// Examples in Java
"effffff".replace("f", "jump"); // returns "ejumpjumpjumpjumpjumpjump"
"effffff".replaceAll("f+", "jump"); // returns "ejump"
# Examples in Raku
"effffff".subst("f", "jump", :g); # returns "ejumpjumpjumpjumpjumpjump"
"blah".subst("z", "y", :g); # returns "blah"
' Examples in Visual Basic
Replace("effffff", "f", "jump") ' returns "ejumpjumpjumpjumpjumpjump"
Replace("blah", "z", "y") ' returns "blah"
# Examples in Windows PowerShell
"effffff" -replace "f", "jump" # returns "ejumpjumpjumpjumpjumpjump"
"effffff" -replace "f+", "jump" # returns "ejump"
reverse
" Example in Smalltalk "
'hello' reversed " returns 'olleh' "
# Example in Perl 5
scalar reverse "hello" # returns "olleh"
# Example in Raku
"hello".flip # returns "olleh"
# Example in Python
"hello"[::-1] # returns "olleh"
; Example in Scheme
(use-modules (srfi srfi-13))
(string-reverse "hello") ; returns "olleh"
rfind
; Examples in Common Lisp
(search "e" "Hello mate" :from-end t) ; returns 9
(search "z" "word" :from-end t) ; returns NIL
// Examples in C#
"Hello mate".LastIndexOf("e"); // returns 9
"Hello mate".LastIndexOf("e", 4); // returns 1
"word".LastIndexOf("z"); // returns -1
# Examples in Perl 5
rindex("Hello mate", "e"); # returns 9
rindex("Hello mate", "e", 4); # returns 1
rindex("word", "z"); # returns -1
# Examples in Raku
"Hello mate".rindex("e"); # returns 9
"Hello mate".rindex("e", 4); # returns 1
"word".rindex('z'); # returns Nil
' Examples in Visual Basic
InStrRev("Hello mate", "e") ' returns 10
InStrRev("Hello mate", "e", 5) ' returns 2
InStrRev("word", "z") ' returns 0
right
// Examples in Java; extract rightmost 4 characters
String str = "CarDoor";
str.substring(str.length()-4); // returns "Door"
# Examples in Raku
"abcde".substr(*-3); # returns "cde"
"abcde".substr(*-8); # 'out of range' error
/* Examples in Rexx */
right("abcde", 3) /* returns "cde" */
right("abcde", 8) /* returns "   abcde" */
right("abcde", 8, "*") /* returns "***abcde" */
; Examples in Scheme
(use-modules (srfi srfi-13))
(string-take-right "abcde" 3) ; returns "cde"
(string-take-right "abcde" 8) ; error
' Examples in Visual Basic
Right("sandroguidi", 3) ' returns "idi"
Right("sandroguidi", 100) ' returns "sandroguidi"
rpartition
# Examples in Python
"Spam eggs spam spam and ham".rpartition('spam') ### ('Spam eggs spam ', 'spam', ' and ham')
"Spam eggs spam spam and ham".rpartition('X') ### ("", "", 'Spam eggs spam spam and ham')
slice
see #substring
split
// Example in C#
"abc,defgh,ijk".Split(','); // {"abc", "defgh", "ijk"}
"abc,defgh;ijk".Split(',', ';'); // {"abc", "defgh", "ijk"}
% Example in Erlang
string:tokens("abc;defgh;ijk", ";"). % ["abc", "defgh", "ijk"]
// Examples in Java
"abc,defgh,ijk".split(","); // {"abc", "defgh", "ijk"}
"abc,defgh;ijk".split(",|;"); // {"abc", "defgh", "ijk"}
{ Example in Pascal }
var
lStrings: TStringList;
lStr: string;
begin
lStrings := TStringList.Create;
lStrings.Delimiter := ',';
lStrings.DelimitedText := 'abc,defgh,ijk';
lStr := lStrings.Strings[0]; // 'abc'
lStr := lStrings.Strings[1]; // 'defgh'
lStr := lStrings.Strings[2]; // 'ijk'
end;
# Examples in Perl 5
split(/spam/, 'Spam eggs spam spam and ham'); # ('Spam eggs ', ' ', ' and ham')
split(/X/, 'Spam eggs spam spam and ham'); # ('Spam eggs spam spam and ham')
# Examples in Raku
'Spam eggs spam spam and ham'.split(/spam/); # (Spam eggs and ham)
split(/X/, 'Spam eggs spam spam and ham'); # (Spam eggs spam spam and ham)
sprintf
see #Format
strip
see #trim
strcmp
see #Compare (integer result)
substring
// Examples in C#
"abc".Substring(1, 1); // returns "b"
"abc".Substring(1, 2); // returns "bc"
"abc".Substring(1, 6); // error
;; Examples in Common Lisp
(subseq "abc" 1 2) ; returns "b"
(subseq "abc" 2) ; returns "c"
% Examples in Erlang
string:substr("abc", 2, 1). % returns "b"
string:substr("abc", 2). % returns "bc"
# Examples in Perl 5
substr("abc", 1, 1); # returns "b"
substr("abc", 1); # returns "bc"
# Examples in Raku
"abc".substr(1, 1); # returns "b"
"abc".substr(1); # returns "bc"
# Examples in Python
"abc"[1:2] # returns "b"
"abc"[1:3] # returns "bc"
/* Examples in Rexx */
substr("abc", 2, 1) /* returns "b" */
substr("abc", 2) /* returns "bc" */
substr("abc", 2, 6) /* returns "bc    " */
substr("abc", 2, 6, "*") /* returns "bc****" */
Uppercase
// Example in C#
"Wiki means fast?".ToUpper(); // "WIKI MEANS FAST?"
# Example in Perl 5
uc("Wiki means fast?"); # "WIKI MEANS FAST?"
# Example in Raku
uc("Wiki means fast?"); # "WIKI MEANS FAST?"
"Wiki means fast?".uc; # "WIKI MEANS FAST?"
/* Example in Rexx */
translate("Wiki means fast?") /* "WIKI MEANS FAST?" */
/* Example #2 */
A='This is an example.'
UPPER A /* "THIS IS AN EXAMPLE." */
/* Example #3 */
A='upper using Translate Function.'
Translate UPPER VAR A Z /* Z="UPPER USING TRANSLATE FUNCTION." */
; Example in Scheme
(use-modules (srfi srfi-13))
(string-upcase "Wiki means fast?") ; "WIKI MEANS FAST?"
' Example in Visual Basic
UCase("Wiki means fast?") ' "WIKI MEANS FAST?"
trim
trim or strip is used to remove whitespace from the beginning, end, or both beginning and end, of a string.
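For illustration, Python provides all three variants as built-in string methods (strip, lstrip and rstrip). The sample value is made up for this example.

```python
padded = "  \thello world \n"

print(repr(padded.strip()))   # 'hello world'      (both ends)
print(repr(padded.lstrip()))  # 'hello world \n'   (beginning only)
print(repr(padded.rstrip()))  # '  \thello world'  (end only)
```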
Other languages
In languages without a built-in trim function, it is usually simple to create a custom function which accomplishes the same task.
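A minimal sketch of such a custom function, written in Python purely for illustration (Python already has a built-in strip). It scans inward from both ends and returns the slice in between; the name trim and its exact behaviour are assumptions of this example.

```python
def trim(text: str) -> str:
    """Return text without leading or trailing whitespace."""
    start, end = 0, len(text)
    # Skip whitespace at the beginning.
    while start < end and text[start].isspace():
        start += 1
    # Skip whitespace at the end.
    while end > start and text[end - 1].isspace():
        end -= 1
    return text[start:end]

print(repr(trim("  hello world \n")))  # 'hello world'
```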
APL
APL can use regular expressions directly:
Trim←'^ +| +$'⎕R''
Alternatively, a functional approach combining Boolean masks that filter away leading and trailing spaces:
Trim←{⍵/⍨(∨\∧∘⌽∨\∘⌽)' '≠⍵}
Or reverse and remove leading spaces, twice:
Trim←{(∨\' '≠⍵)/⍵}∘⌽⍣2
AWK
In AWK, one can use regular expressions to trim:
ltrim(v) = gsub(/^[ \t]+/, "", v)
rtrim(v) = gsub(/[ \t]+$/, "", v)
trim(v) = ltrim(v); rtrim(v)
or:
function ltrim(s) { sub(/^[ \t]+/, "", s); return s }
function rtrim(s) { sub(/[ \t]+$/, "", s); return s }
function trim(s) { return rtrim(ltrim(s)); }
C/C++
There is no standard trim function in C or C++. Most of the available string libraries for C contain code which implements trimming, or functions that significantly ease an efficient implementation. The function has also often been called EatWhitespace in some non-standard C libraries.
In C, programmers often combine an ltrim and an rtrim to implement trim:
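#include <string.h>
#include <ctype.h>
/* Remove trailing whitespace in place. */
void rtrim(char *str)
{
    size_t n = strlen(str);
    /* cast to unsigned char: isspace() is undefined for negative values */
    while (n > 0 && isspace((unsigned char) str[n - 1])) {
        n--;
    }
    str[n] = '\0';
}
/* Remove leading whitespace in place by shifting the rest to the front. */
void ltrim(char *str)
{
    size_t n = 0;
    while (str[n] != '\0' && isspace((unsigned char) str[n])) {
        n++;
    }
    memmove(str, str + n, strlen(str) - n + 1);
}
/* Remove whitespace from both ends in place. */
void trim(char *str)
{
    rtrim(str);
    ltrim(str);
}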
The open source C++ library Boost has several trim variants, including a standard one:
#include <boost/algorithm/string/trim.hpp>
trimmed = boost::algorithm::trim_copy("string");
Boost's function named simply trim modifies the input sequence in place and returns no result.
Another open source C++ library, Qt, has several trim variants, including a standard one:
#include <QString>
trimmed = s.trimmed();
The Linux kernel also includes a strip function, strstrip(), since 2.6.18-rc1, which trims the string "in place". Since 2.6.33-rc1, the kernel uses strim() instead of strstrip() to avoid false warnings.
Haskell
A trim algorithm in Haskell:
import Data.Char (isSpace)
trim :: String -> String
trim = f . f
where f = reverse . dropWhile isSpace
This may be interpreted as follows: f drops the leading whitespace and then reverses the string; f is then applied again to its own output. Note that the type signature (the second line) is optional.
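The same drop-and-reverse idea can be sketched in Python. The helper name drop_then_reverse is an assumption of this example and plays the role of f above.

```python
from itertools import dropwhile

def drop_then_reverse(chars):
    """Counterpart of f: drop leading whitespace, then reverse what is left."""
    return list(dropwhile(str.isspace, chars))[::-1]

def trim(text: str) -> str:
    # Applying the helper twice trims both ends and restores the original order.
    return "".join(drop_then_reverse(drop_then_reverse(text)))

print(repr(trim("  hi there \n")))  # 'hi there'
```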
J
The trim algorithm in J is a functional description:
trim =. #~ [: (+./\ *. +./\.) ' '&~:
That is: filter (#~) for non-space characters (' '&~:) between leading (+./\) and (*.) trailing (+./\.) spaces.
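The mask-based approach, which is also the idea behind the second APL definition, can be sketched in Python with running ORs. This is an illustrative translation, not a literal rendering of the J or APL code, and like those definitions it treats only the space character as whitespace.

```python
from itertools import accumulate
from operator import or_

def trim(text: str) -> str:
    non_space = [ch != " " for ch in text]
    # Running OR from the left: True from the first non-space onward.
    after_leading = list(accumulate(non_space, or_))
    # Running OR from the right: True up to and including the last non-space.
    before_trailing = list(accumulate(reversed(non_space), or_))[::-1]
    # Keep the characters where both masks are True.
    return "".join(
        ch
        for ch, lead, trail in zip(text, after_leading, before_trailing)
        if lead and trail
    )

print(repr(trim("  a b  ")))  # 'a b'
```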
JavaScript
There is a built-in trim function in JavaScript 1.8.1 (Firefox 3.5 and later), and the ECMAScript 5 standard. In earlier versions it can be added to the String object's prototype as follows:
String.prototype.trim = function() {
return this.replace(/^\s+/g, "").replace(/\s+$/g, "");
};
Perl
Perl 5 has no built-in trim function. However, the functionality is commonly achieved using regular expressions.
Example:
$string =~ s/^\s+//; # remove leading whitespace
$string =~ s/\s+$//; # remove trailing whitespace
or:
$string =~ s/^\s+|\s+$//g ; # remove both leading and trailing whitespace
These examples modify the value of the original variable $string.
Also available for Perl is StripLTSpace in String::Strip from CPAN.
There are, however, two functions that are commonly used to strip whitespace from the end of strings, chomp and chop:
- chop removes the last character from a string and returns it.
- chomp removes the trailing newline character(s) from a string if present. (What constitutes a newline is $INPUT_RECORD_SEPARATOR dependent).
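The difference between the two can be sketched in Python. These helpers are hypothetical and only approximate the Perl built-ins: Perl modifies the variable in place, whereas the functions below return new strings, and the separator parameter stands in for $INPUT_RECORD_SEPARATOR.

```python
from typing import Tuple

def chop(text: str) -> Tuple[str, str]:
    """Return (shortened text, removed character); removes any last character."""
    return text[:-1], text[-1:]

def chomp(text: str, separator: str = "\n") -> str:
    """Remove one trailing separator, and only if it is present."""
    if separator and text.endswith(separator):
        return text[: -len(separator)]
    return text

print(chop("line"))           # ('lin', 'e')
print(repr(chomp("line\n")))  # 'line'
print(repr(chomp("line")))    # 'line'
```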
In Raku, Perl's sister language, strings have a trim method.
Example:
$string = $string.trim; # remove leading and trailing whitespace
$string .= trim; # same thing
Tcl
The Tcl string command has three relevant subcommands: trim, trimright and trimleft. For each of those commands, an additional argument may be specified: a string that represents a set of characters to remove. The default is whitespace (space, tab, newline, carriage return).
Example of trimming vowels:
set string onomatopoeia
set trimmed [string trim $string aeiou] ;# result is nomatop
set r_trimmed [string trimright $string aeiou] ;# result is onomatop
set l_trimmed [string trimleft $string aeiou] ;# result is nomatopoeia
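For comparison, Python's strip, rstrip and lstrip methods accept the same kind of optional character-set argument. The following sketch repeats the vowel example with an illustrative variable name.

```python
word = "onomatopoeia"

print(word.strip("aeiou"))   # nomatop
print(word.rstrip("aeiou"))  # onomatop
print(word.lstrip("aeiou"))  # nomatopoeia
```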
XSLT
XSLT includes the function normalize-space(string) which strips leading and trailing whitespace, in addition to replacing any whitespace sequence (including line breaks) with a single space.
Example:
<xsl:variable name='trimmed'>
<xsl:value-of select='normalize-space(string)'/>
</xsl:variable>
XSLT 2.0 includes regular expressions, providing another mechanism to perform string trimming.
Another XSLT technique for trimming is to utilize the XPath 2.0 substring() function.
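The effect of normalize-space described above can be approximated in Python as a rough sketch. The function name normalize_space is an assumption of this example, and Python's split() recognises a broader set of whitespace characters than XPath does, so the two are not exactly equivalent.

```python
def normalize_space(text: str) -> str:
    # split() with no argument drops leading and trailing whitespace and
    # splits on runs of whitespace; join() puts single spaces back in.
    return " ".join(text.split())

print(repr(normalize_space("  several   words \n on two lines  ")))
# 'several words on two lines'
```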