Comparison of programming languages (string functions)

String functions are used in computer programming languages to manipulate a string or query information about a string (some do both).

Most programming languages that have a string datatype provide some string functions, although there may be other low-level ways within each language to handle strings directly. In object-oriented languages, string functions are often implemented as properties and methods of string objects. In functional and list-based languages a string is represented as a list (of character codes), so all list-manipulation procedures could be considered string functions. However, such languages may implement a subset of explicit string-specific functions as well.
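The idea of a string as a list of character codes can be sketched in Python. This is only an illustration: Python itself has a dedicated str type, and the helper name codes_to_str is a hypothetical name introduced here.

```python
# Treat a string as a list of character codes (Unicode code points).
text = "hello"
codes = [ord(ch) for ch in text]           # [104, 101, 108, 108, 111]

# Generic list operations then double as string operations.
reversed_codes = list(reversed(codes))     # list reversal acts as string reversal
extended_codes = codes + [ord("!")]        # list concatenation acts as string concatenation


def codes_to_str(code_list):
    """Convert a list of code points back into a str (hypothetical helper)."""
    return "".join(chr(c) for c in code_list)


print(codes_to_str(reversed_codes))  # olleh
print(codes_to_str(extended_codes))  # hello!
print(len(codes))                    # 5
```

No string-specific function is needed for reversal, concatenation, or length once the string is held as a list.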

For functions that manipulate strings, modern object-oriented languages such as C# and Java have immutable strings and return a copy (in newly allocated dynamic memory), while others, such as C, manipulate the original string unless the programmer copies the data to a new string. See, for example, Concatenation below.
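A minimal Python sketch of this distinction, assuming the built-in str as the immutable case and bytearray as a stand-in for a C-style mutable buffer (the use of bytearray is an illustration chosen here, not something the text prescribes):

```python
# Immutable string: methods return a new object, the original is unchanged.
original = "hello"
upper = original.upper()
print(original, upper)        # hello HELLO

# Item assignment on a str is rejected.
try:
    original[0] = "J"
except TypeError:
    print("str does not support item assignment")

# Mutable buffer: a bytearray is modified in place, much like a C char array.
buffer = bytearray(b"hello")
alias = buffer                # second name for the same buffer
buffer[0] = ord("J")
print(alias.decode("ascii"))  # Jello
```

Because alias and buffer name the same object, the in-place change is visible through both, which is the behavior a programmer has to guard against by copying first.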

The most basic example of a string function is the length(string) function, which returns the length of a string.

e.g. length("hello world") would return 11.

Other languages may have string functions with similar or identical syntax, parameters, or outcomes under different names. For example, in many languages the length function is written len(string). The list of common functions below aims to limit this confusion.

Common string functions (multi language reference)

String functions common to many languages are listed below, including the different names used. The list aims to help programmers find the equivalent function in a language. Note that string concatenation and regular expressions are handled in separate pages. Statements in guillemets (« … ») are optional.

CharAt

{ Example in Pascal }
var
  MyStr: string = 'Hello, World';
  MyChar: Char;
begin
  MyChar := MyStr[2];          // 'e'
end.
# Example in ALGOL 68 #
"Hello, World"[2];             # 'e' #
// Example in C
#include <stdio.h>             // for printf
char MyStr[] = "Hello, World";
printf("%c", *(MyStr+1));      // 'e'
printf("%c", *(MyStr+7));      // 'W'
printf("%c", MyStr[11]);       // 'd'
printf("%s", MyStr);           // 'Hello, World'
printf("%s", "Hello(2), World(2)"); // 'Hello(2), World(2)'
// Example in C++
#include <iostream>            // for "cout"
#include <string>              // for "string" data type
using namespace std;
char MyStr1[] = "Hello(1), World(1)";
string MyStr2 = "Hello(2), World(2)";
cout << "Hello(3), World(3)";  // 'Hello(3), World(3)'
cout << MyStr2[6];             // '2'
cout << MyStr2.substr(5, 3);   // '(2)'
// Example in C#
"Hello, World"[2];             // 'l'
# Example in Perl 5
substr("Hello, World", 1, 1);  # 'e'
# Examples in Python
"Hello, World"[2]              #  'l'
"Hello, World"[-3]             #  'r'
# Example in Raku
"Hello, World".substr(1, 1);   # 'e'
' Example in Visual Basic
Mid("Hello, World",2,1)        '  "e"
' Example in Visual Basic .NET
"Hello, World".Chars(2)    '  "l"c
" Example in Smalltalk "
'Hello, World' at: 2.        "$e"
//Example in Rust
"Hello, World".chars().nth(2);   // Some('l')

Compare (integer result)

# Example in Perl 5
"hello" cmp "world";       # returns -1
# Example in Python 2 (cmp() was removed in Python 3)
cmp("hello", "world")      # returns -1
# Examples in Raku
"hello" cmp "world";       # returns Less
"world" cmp "hello";       # returns More
"hello" cmp "hello";       # returns Same
/** Example in Rexx */
compare("hello", "world")  /* returns index of mismatch: 1 */
; Example in Scheme
(use-modules (srfi srfi-13))
; returns index of mismatch: 0
(string-compare "hello" "world" values values values)
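Python 3 has no built-in cmp, so an integer-result comparison has to be written by hand. The sketch below is one common idiom; the function name compare is a hypothetical name chosen here, and the ordering is Python's default code-point ordering rather than any locale-aware collation.

```python
def compare(a: str, b: str) -> int:
    """Three-way comparison: -1 if a < b, 0 if equal, 1 if a > b."""
    # Booleans are integers in Python, so the difference is -1, 0, or 1.
    return (a > b) - (a < b)


print(compare("hello", "world"))  # -1
print(compare("world", "hello"))  # 1
print(compare("hello", "hello"))  # 0
```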

Compare (relational operator-based, Boolean result)

% Example in Erlang
"hello" > "world".            % returns false
# Example in Raku
"art" gt "painting";           # returns False
"art" lt "painting";           # returns True
# Example in Windows PowerShell
"hello" -gt "world"           # returns false
;; Example in Common Lisp
(string> "art" "painting")      ; returns nil
(string< "art" "painting")      ; returns non nil

Concatenation

{ Example in Pascal }
'abc' + 'def';      // returns "abcdef"
// Example in C#
"abc" + "def";      // returns "abcdef"
' Example in Visual Basic
"abc" & "def"       '  returns "abcdef"
"abc" + "def"       '  returns "abcdef"
"abc" & Null        '  returns "abc"
"abc" + Null        '  returns Null
// Example in D
"abc" ~ "def";      // returns "abcdef"
;; Example in Common Lisp
(concatenate 'string "abc " "def " "ghi")  ; returns "abc def ghi"
# Example in Perl 5
"abc" . "def";      # returns "abcdef"
"Perl " . 5;        # returns "Perl 5"
# Example in Raku
"abc" ~ "def";      # returns "abcdef"
"Perl " ~ 6;        # returns "Perl 6"

Contains

¢ Example in ALGOL 68 ¢
string in string("e", loc int, "Hello mate");      ¢ returns true ¢
string in string("z", loc int, "word");            ¢ returns false ¢
// Example In C#
"Hello mate".Contains("e");      // returns true
"word".Contains("z");            // returns false
#  Example in Python
"e" in "Hello mate"              #  returns True
"z" in "word"                    #  returns False
#  Example in Raku
"Good morning!".contains('z')    #  returns False
"¡Buenos días!".contains('í');   #  returns True
"  Example in Smalltalk "
'Hello mate' includesSubstring: 'e'  " returns true "
'word' includesSubstring: 'z'        " returns false "

Equality

Tests if two strings are equal. See also #Compare (integer result) and #Compare (relational operator-based, Boolean result). Note that doing equality checks via a generic Compare with integer result is not only confusing for the programmer but is often a significantly more expensive operation; this is especially true when using "C-strings".

// Example in C#
"hello" == "world"           // returns false
' Example in Visual Basic
"hello" = "world"            '  returns false
# Examples in Perl 5
'hello' eq 'world'           # returns 0
'hello' eq 'hello'           # returns 1
# Examples in Raku
'hello' eq 'world'           # returns False
'hello' eq 'hello'           # returns True
# Example in Windows PowerShell
"hello" -eq "world"          #  returns false
⍝ Example in APL
'hello' ≡ 'world'          ⍝  returns 0


Find

Examples

  • Common Lisp
    (search "e" "Hello mate")             ;  returns 1
    (search "z" "word")                   ;  returns NIL
    
  • C#
    "Hello mate".IndexOf("e");            // returns 1
    "Hello mate".IndexOf("e", 4);         // returns 9
    "word".IndexOf("z");                  // returns -1
    
  • Raku
    "Hello, there!".index('e')           # returns 1
    "Hello, there!".index('z')           # returns Nil
    
  • Scheme
    (use-modules (srfi srfi-13))
    (string-contains "Hello mate" "e")    ;  returns 1
    (string-contains "word" "z")          ;  returns #f
    
  • Visual Basic
    InStr("Hello mate", "e")              '  returns 2
    InStr(5, "Hello mate", "e")           '  returns 10
    InStr("word", "z")                    '  returns 0
    
  • Smalltalk
    'Hello mate' indexOfSubCollection:'ate'  "returns 8"
    
    'Hello mate' indexOfSubCollection:'late' "returns 0"
    
    'Hello mate'
        indexOfSubCollection:'late'
        ifAbsent:[ 99 ]                      "returns 99"
    
    'Hello mate'
        indexOfSubCollection:'late'
        ifAbsent:[ self error ]              "raises an exception"
    


Find character

// Examples in C#
"Hello mate".IndexOf('e');              // returns 1
"word".IndexOf('z')                     // returns -1
; Examples in Common Lisp
(position #\e "Hello mate")             ;  returns 1
(position #\z "word")                   ;  returns NIL

Note: given a set of characters, SCAN returns the position of the first character found, while VERIFY returns the position of the first character that does not belong to the set.
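The difference between the two operations can be sketched in Python. The function names scan and verify mirror the names in the note, but the signatures, the 0-based positions, and the -1 "not found" value are assumptions made for this sketch, not the conventions of any particular language.

```python
def scan(text: str, charset: str) -> int:
    """Index of the first character of text that IS in charset, or -1."""
    for i, ch in enumerate(text):
        if ch in charset:
            return i
    return -1


def verify(text: str, charset: str) -> int:
    """Index of the first character of text that is NOT in charset, or -1."""
    for i, ch in enumerate(text):
        if ch not in charset:
            return i
    return -1


print(scan("Hello mate", "aeiou"))       # 1
print(verify("12345a7", "0123456789"))   # 5
print(verify("12345", "0123456789"))     # -1
```

verify is the natural choice for validation ("does this string contain only digits?"), while scan locates the first occurrence of any character from a set.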

Format

// Example in C#
String.Format("My {0} costs {1:C2}", "pen", 19.99); // returns "My pen costs $19.99"
// Example in Object Pascal (Delphi)
Format('My %s costs $%2f', ['pen', 19.99]);         // returns "My pen costs $19.99"
// Example in Java
String.format("My %s costs $%.2f", "pen", 19.99);   // returns "My pen costs $19.99"
# Examples in Raku
sprintf "My %s costs \$%.2f", "pen", 19.99;          # returns "My pen costs $19.99"
1.fmt("%04d");                                       # returns "0001"
# Examples in Python
"My %s costs $%.2f" % ("pen", 19.99)                 #  returns "My pen costs $19.99"
"My {0} costs ${1:.2f}".format("pen", 19.99)         #  returns "My pen costs $19.99"
# Example in Python 3.6+
pen = "pen"
f"My {pen} costs {19.99}"                            #  returns "My pen costs 19.99"
; Example in Scheme
(format "My ~a costs $~1,2F" "pen" 19.99)           ;  returns "My pen costs $19.99"
/* example in PL/I */
put string(some_string) edit('My ', 'pen', ' costs', 19.99)(a,a,a,p'$$$V.99')
/* returns "My pen costs $19.99" */

Inequality

Tests if two strings are not equal. See also #Equality.

// Example in C#
"hello" != "world"    // returns true
' Example in Visual Basic
"hello" <> "world"    '  returns true
;; Example in Clojure
(not= "hello" "world")  ; ⇒ true
# Example in Perl 5
'hello' ne 'world'      # returns 1
# Example in Raku
'hello' ne 'world'      # returns True
# Example in Windows PowerShell
"hello" -ne "world"   #  returns true

index

see #Find

indexof

see #Find

instr

see #Find

instrrev

see #rfind

join

// Example in C#
String.Join("-", new[] {"a", "b", "c"})  // "a-b-c"
" Example in Smalltalk "
#('a' 'b' 'c') joinUsing: '-'      " 'a-b-c' "
# Example in Perl 5
join( '-', ('a', 'b', 'c'));       # 'a-b-c'
# Example in Raku
<a b c>.join('-');       # 'a-b-c'
# Example in Python
"-".join(["a", "b", "c"])          #  'a-b-c'
# Example in Ruby
["a", "b", "c"].join("-")          #  'a-b-c'
; Example in Scheme
(use-modules (srfi srfi-13))
(string-join '("a" "b" "c") "-")   ;  "a-b-c"

lastindexof

see #rfind

left

# Example in Raku
"Hello, there!".substr(0, 6);  # returns "Hello,"
/* Examples in Rexx */
left("abcde", 3)         /* returns "abc"      */
left("abcde", 8)         /* returns "abcde   " */
left("abcde", 8, "*")    /* returns "abcde***" */
; Examples in Scheme
(use-modules (srfi srfi-13))
(string-take "abcde" 3)  ;  returns "abc"
(string-take "abcde" 8)  ;  error
' Examples in Visual Basic
Left("sandroguidi", 3)   '  returns "san"
Left("sandroguidi", 100) '  returns "sandroguidi"


len

see #length


length

// Examples in C#
"hello".Length;      // returns 5
"".Length;           // returns 0
% Examples in Erlang
string:len("hello"). %  returns 5
string:len("").      %  returns 0
# Examples in Perl 5
length("hello");     #  returns 5
length("");          #  returns 0
# Examples in Raku
"🏳️‍🌈".chars; chars "🏳️‍🌈";      # both return 1
"🏳️‍🌈".codes; codes "🏳️‍🌈";      # both return 4
"".chars; chars "";          # both return 0
"".codes; codes "";          # both return 0
' Examples in Visual Basic
Len("hello")         '  returns 5
Len("")              '  returns 0
// Examples in Objective-C
[@"hello" length]    // returns 5
[@"" length]         // returns 0
-- Examples in Lua
("hello"):len() -- returns 5
#"" -- returns 0

locate

see #Find


Lowercase

// Example in C#
"Wiki means fast?".ToLower();        // "wiki means fast?"
; Example in Scheme
(use-modules (srfi srfi-13))
(string-downcase "Wiki means fast?") ;  "wiki means fast?"
/* Example in C */
#include <ctype.h>
#include <stdio.h>
int main(void) {
    char string[] = "Wiki means fast?";
    size_t i;
    for (i = 0; i < sizeof(string) - 1; ++i) {
        /* transform characters in place, one by one; the cast avoids
           undefined behavior for negative char values */
        string[i] = (char) tolower((unsigned char) string[i]);
    }
    puts(string);                       /* "wiki means fast?" */
    return 0;
}
# Example in Raku
"Wiki means fast?".lc;             # "wiki means fast?"


mid

see #substring


partition

# Examples in Python
"Spam eggs spam spam and ham".partition('spam')   # ('Spam eggs ', 'spam', ' spam and ham')
"Spam eggs spam spam and ham".partition('X')      # ('Spam eggs spam spam and ham', "", "")
# Examples in Perl 5 / Raku
split /(spam)/, 'Spam eggs spam spam and ham' ,2;   # ('Spam eggs ', 'spam', ' spam and ham');
split /(X)/, 'Spam eggs spam spam and ham' ,2;      # ('Spam eggs spam spam and ham');


replace

// Examples in C#
"effffff".Replace("f", "jump");     // returns "ejumpjumpjumpjumpjumpjump"
"blah".Replace("z", "y");           // returns "blah"
// Examples in Java
"effffff".replace("f", "jump");     // returns "ejumpjumpjumpjumpjumpjump"
"effffff".replaceAll("f+", "jump"); // returns "ejump"
# Examples in Raku
"effffff".subst("f", "jump", :g);    # returns "ejumpjumpjumpjumpjumpjump"
"blah".subst("z", "y", :g);          # returns "blah"
' Examples in Visual Basic
Replace("effffff", "f", "jump")     '  returns "ejumpjumpjumpjumpjumpjump"
Replace("blah", "z", "y")           '  returns "blah"
# Examples in Windows PowerShell
"effffff" -replace "f", "jump"      #  returns "ejumpjumpjumpjumpjumpjump"
"effffff" -replace "f+", "jump"     #  returns "ejump"

reverse

" Example in Smalltalk "
'hello' reversed             " returns 'olleh' "
# Example in Perl 5
scalar reverse "hello"       # returns "olleh"
# Example in Raku
"hello".flip                 # returns "olleh"
# Example in Python
"hello"[::-1]                # returns "olleh"
; Example in Scheme
(use-modules (srfi srfi-13))
(string-reverse "hello")     ; returns "olleh"

rfind

; Examples in Common Lisp
(search "e" "Hello mate" :from-end t)     ;  returns 9
(search "z" "word" :from-end t)           ;  returns NIL
// Examples in C#
"Hello mate".LastIndexOf("e");           // returns 9
"Hello mate".LastIndexOf("e", 4);        // returns 1
"word".LastIndexOf("z");                 // returns -1
# Examples in Perl 5
rindex("Hello mate", "e");               # returns 9
rindex("Hello mate", "e", 4);            # returns 1
rindex("word", "z");                     # returns -1
# Examples in Raku
"Hello mate".rindex("e");                # returns 9
"Hello mate".rindex("e", 4);             # returns 1
"word".rindex('z');                      # returns Nil
' Examples in Visual Basic
InStrRev("Hello mate", "e")              '  returns 10
InStrRev("Hello mate", "e", 5)           '  returns 2
InStrRev("word", "z")                    '  returns 0


right

// Examples in Java; extract rightmost 4 characters
String str = "CarDoor";
str.substring(str.length()-4); // returns 'Door'
# Examples in Raku
"abcde".substr(*-3);          # returns "cde"
"abcde".substr(*-8);          # 'out of range' error
/* Examples in Rexx */
right("abcde", 3)              /* returns "cde"      */
right("abcde", 8)              /* returns "   abcde" */
right("abcde", 8, "*")         /* returns "***abcde" */
; Examples in Scheme
(use-modules (srfi srfi-13))
(string-take-right "abcde" 3)  ;  returns "cde"
(string-take-right "abcde" 8)  ;  error
' Examples in Visual Basic
Right("sandroguidi", 3)        '  returns "idi"
Right("sandroguidi", 100)      '  returns "sandroguidi"


rpartition

# Examples in Python
"Spam eggs spam spam and ham".rpartition('spam')  # ('Spam eggs spam ', 'spam', ' and ham')
"Spam eggs spam spam and ham".rpartition('X')     # ("", "", 'Spam eggs spam spam and ham')

slice

see #substring


split

// Example in C#
"abc,defgh,ijk".Split(',');                 // {"abc", "defgh", "ijk"}
"abc,defgh;ijk".Split(',', ';');            // {"abc", "defgh", "ijk"}
% Example in Erlang
string:tokens("abc;defgh;ijk", ";").        %  ["abc", "defgh", "ijk"]
// Examples in Java
"abc,defgh,ijk".split(",");                 // {"abc", "defgh", "ijk"}
"abc,defgh;ijk".split(",|;");               // {"abc", "defgh", "ijk"}
{ Example in Pascal }
var
  lStrings: TStringList;
  lStr: string;
begin
  lStrings := TStringList.Create;
  lStrings.Delimiter := ',';
  lStrings.DelimitedText := 'abc,defgh,ijk';
  lStr := lStrings.Strings[0]; // 'abc'
  lStr := lStrings.Strings[1]; // 'defgh'
  lStr := lStrings.Strings[2]; // 'ijk'
end;
# Examples in Perl 5
split(/spam/, 'Spam eggs spam spam and ham'); # ('Spam eggs ', ' ', ' and ham')
split(/X/, 'Spam eggs spam spam and ham');    # ('Spam eggs spam spam and ham')
# Examples in Raku
'Spam eggs spam spam and ham'.split(/spam/);  # (Spam eggs     and ham)
split(/X/, 'Spam eggs spam spam and ham');    # (Spam eggs spam spam and ham)


sprintf

see #Format

strip

see #trim


strcmp

see #Compare (integer result)


substring

// Examples in C#
"abc".Substring(1, 1);      // returns "b"
"abc".Substring(1, 2);      // returns "bc"
"abc".Substring(1, 6);      // error
;; Examples in Common Lisp
(subseq "abc" 1 2)          ; returns "b"
(subseq "abc" 2)            ; returns "c"
% Examples in Erlang
string:substr("abc", 2, 1). %  returns "b"
string:substr("abc", 2).    %  returns "bc"
# Examples in Perl 5
substr("abc", 1, 1);       #  returns "b"
substr("abc", 1);          #  returns "bc"
# Examples in Raku
"abc".substr(1, 1);        #  returns "b"
"abc".substr(1);           #  returns "bc"
# Examples in Python
"abc"[1:2]                 #  returns "b"
"abc"[1:3]                 #  returns "bc"
/* Examples in Rexx */
substr("abc", 2, 1)         /* returns "b"      */
substr("abc", 2)            /* returns "bc"     */
substr("abc", 2, 6)         /* returns "bc    " */
substr("abc", 2, 6, "*")    /* returns "bc****" */


Uppercase

// Example in C#
"Wiki means fast?".ToUpper();      // "WIKI MEANS FAST?"
# Example in Perl 5
uc("Wiki means fast?");             # "WIKI MEANS FAST?"
# Example in Raku
uc("Wiki means fast?");             # "WIKI MEANS FAST?"
"Wiki means fast?".uc;              # "WIKI MEANS FAST?"
/* Example in Rexx */
translate("Wiki means fast?")      /* "WIKI MEANS FAST?" */

/* Example #2 */
A='This is an example.'
UPPER A                            /* "THIS IS AN EXAMPLE." */

/* Example #3 */
A='upper using Translate Function.'
Z = translate(A)                   /* Z="UPPER USING TRANSLATE FUNCTION." */
; Example in Scheme
(use-modules (srfi srfi-13))
(string-upcase "Wiki means fast?") ;  "WIKI MEANS FAST?"
' Example in Visual Basic
UCase("Wiki means fast?")          '  "WIKI MEANS FAST?"

trim

trim or strip is used to remove whitespace from the beginning, end, or both beginning and end, of a string.
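As an illustration of the three variants, Python's built-in str methods strip, lstrip, and rstrip cover both ends, the beginning only, and the end only. The sample value is made up for this sketch.

```python
padded = "  \thello world \n"

both_ends = padded.strip()    # remove whitespace at beginning and end
left_only = padded.lstrip()   # remove whitespace at the beginning only
right_only = padded.rstrip()  # remove whitespace at the end only

print(repr(both_ends))   # 'hello world'
print(repr(left_only))   # 'hello world \n'
print(repr(right_only))  # '  \thello world'
```

Interior whitespace is never touched; only the leading and trailing runs are removed.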

Other languages

In languages without a built-in trim function, it is usually simple to create a custom function which accomplishes the same task.
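A custom trim can be sketched as two index scans, one from each end. The version below is written in Python purely for illustration (Python already has str.strip); the name custom_trim and the WHITESPACE set are assumptions of this sketch.

```python
WHITESPACE = " \t\n\r\v\f"  # assumed definition of whitespace for this sketch


def custom_trim(text: str) -> str:
    """Return text without leading and trailing whitespace (hypothetical helper)."""
    start = 0
    end = len(text)
    # Advance past leading whitespace.
    while start < end and text[start] in WHITESPACE:
        start += 1
    # Retreat past trailing whitespace.
    while end > start and text[end - 1] in WHITESPACE:
        end -= 1
    return text[start:end]


print(repr(custom_trim("  hello world \n")))  # 'hello world'
print(repr(custom_trim("   ")))               # ''
```

The same two-scan structure carries over to most languages that expose character indexing and substring extraction.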

APL

APL can use regular expressions directly:

Trim←'^ +| +$'⎕R''

Alternatively, a functional approach combining Boolean masks that filter away leading and trailing spaces:

Trim←{⍵/⍨(∨\∧∘⌽∨\∘⌽)' '≠⍵}

Or reverse and remove leading spaces, twice:

Trim←{(∨\' '≠⍵)/⍵}∘⌽⍣2

AWK

In AWK, one can use regular expressions to trim:

 ltrim(v) = gsub(/^[ \t]+/, "", v)
 rtrim(v) = gsub(/[ \t]+$/, "", v)
 trim(v)  = ltrim(v); rtrim(v)

or:

 function ltrim(s) { sub(/^[ \t]+/, "", s); return s }
 function rtrim(s) { sub(/[ \t]+$/, "", s); return s }
 function trim(s)  { return rtrim(ltrim(s)); }

C/C++

There is no standard trim function in C or C++. Most of the available string libraries for C contain code which implements trimming, or functions that significantly ease an efficient implementation. The function has also often been called EatWhitespace in some non-standard C libraries.

In C, programmers often combine a ltrim and rtrim to implement trim:

#include <string.h>
#include <ctype.h>

void rtrim(char *str)
{
  size_t n;
  n = strlen(str);
  /* overwrite trailing whitespace with terminators, scanning from the end */
  while (n > 0 && isspace((unsigned char) str[n - 1])) {
    str[--n] = '\0';
  }
}

void ltrim(char *str)
{
  size_t n;
  n = 0;
  while (str[n] != '\0' && isspace((unsigned char) str[n])) {
    n++;
  }
  memmove(str, str + n, strlen(str) - n + 1);
}

void trim(char *str)
{
  rtrim(str);
  ltrim(str);
}

The open source C++ library Boost has several trim variants, including a standard one:

#include <boost/algorithm/string/trim.hpp>
#include <string>
std::string trimmed = boost::algorithm::trim_copy(std::string("  string  "));

Boost's function named simply trim modifies the input sequence in place and returns no result.

Another open-source C++ library, Qt, has several trim variants, including a standard one:

#include <QString>
trimmed = s.trimmed();

The Linux kernel also includes a strip function, strstrip(), since 2.6.18-rc1, which trims the string "in place". Since 2.6.33-rc1, the kernel uses strim() instead of strstrip() to avoid false warnings.

Haskell

A trim algorithm in Haskell:

 import Data.Char (isSpace)
 trim      :: String -> String
 trim      = f . f
    where f = reverse . dropWhile isSpace

may be interpreted as follows: f drops the preceding whitespace, and reverses the string. f is then again applied to its own output. Note that the type signature (the second line) is optional.
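The same "drop leading whitespace, reverse, repeat" idea can be sketched in Python with itertools.dropwhile. The function names are hypothetical, and str.isspace is assumed to be an acceptable stand-in for Haskell's isSpace.

```python
from itertools import dropwhile


def drop_and_reverse(chars):
    """Drop leading whitespace, then reverse (the role of f in the Haskell code)."""
    return list(reversed(list(dropwhile(str.isspace, chars))))


def trim_by_reversal(text: str) -> str:
    """Apply drop_and_reverse twice, so both ends are trimmed and order is restored."""
    return "".join(drop_and_reverse(drop_and_reverse(text)))


print(repr(trim_by_reversal("  hi there \n")))  # 'hi there'
```

After the first pass the former trailing whitespace sits at the front, so the second pass removes it and the second reversal restores the original order.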

J

The trim algorithm in J is a functional description:

     trim =. #~ [: (+./\ *. +./\.) ' '&~:

That is: filter (#~) for non-space characters (' '&~:) between leading (+./\) and (*.) trailing (+./\.) spaces.
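The Boolean-mask approach can be sketched in Python using running OR scans from each end. The function name trim_by_mask is hypothetical, and, as in the J expression, only the space character is treated as whitespace.

```python
from itertools import accumulate
from operator import or_


def trim_by_mask(text: str) -> str:
    """Keep characters that have a non-space at or before them AND at or after them."""
    non_space = [ch != " " for ch in text]
    # Running OR from the left: True from the first non-space onward.
    seen_from_left = list(accumulate(non_space, or_))
    # Running OR from the right: True up to the last non-space.
    seen_from_right = list(accumulate(reversed(non_space), or_))[::-1]
    return "".join(
        ch
        for ch, left, right in zip(text, seen_from_left, seen_from_right)
        if left and right
    )


print(repr(trim_by_mask("  a b  ")))  # 'a b'
```

The conjunction of the two scans is true exactly between the first and last non-space characters, so interior spaces survive.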

JavaScript

There is a built-in trim function in JavaScript 1.8.1 (Firefox 3.5 and later), and the ECMAScript 5 standard. In earlier versions it can be added to the String object's prototype as follows:

String.prototype.trim = function() {
  return this.replace(/^\s+/g, "").replace(/\s+$/g, "");
};

Perl

Perl 5 has no built-in trim function. However, the functionality is commonly achieved using regular expressions.

Example:

$string =~ s/^\s+//;            # remove leading whitespace
$string =~ s/\s+$//;            # remove trailing whitespace

or:

$string =~ s/^\s+|\s+$//g ;     # remove both leading and trailing whitespace

These examples modify the value of the original variable $string.
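For comparison, the same single-pattern approach can be sketched in Python with re.sub. Unlike the Perl substitution, this returns a new string, since Python strings are immutable; the function name regex_trim is hypothetical.

```python
import re


def regex_trim(text: str) -> str:
    """Remove leading and trailing whitespace with one alternation pattern."""
    return re.sub(r"^\s+|\s+$", "", text)


value = "  hello world \n"
trimmed = regex_trim(value)
print(repr(trimmed))  # 'hello world'
print(repr(value))    # '  hello world \n' (the original is unchanged)
```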

Also available for Perl is StripLTSpace in String::Strip from CPAN.

There are, however, two functions that are commonly used to strip whitespace from the end of strings, chomp and chop:

  • chop removes the last character from a string and returns it.
  • chomp removes the trailing newline character(s) from a string if present. (What constitutes a newline is $INPUT_RECORD_SEPARATOR dependent).
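The behavior of the two functions can be approximated in Python as follows. This is a rough sketch, not a faithful port: the names chop and chomp are reused only for readability, both return new values instead of modifying their argument, and the separator parameter stands in for $INPUT_RECORD_SEPARATOR with an assumed default of "\n".

```python
def chop(text):
    """Return (text without its last character, the removed character)."""
    return text[:-1], text[-1:]


def chomp(text, separator="\n"):
    """Remove one trailing separator if present; otherwise return text unchanged."""
    if separator and text.endswith(separator):
        return text[: -len(separator)]
    return text


print(chop("hello"))            # ('hell', 'o')
print(repr(chomp("line\n")))    # 'line'
print(repr(chomp("line")))      # 'line'
```

chop removes a character unconditionally, whereas chomp leaves a string without the trailing separator untouched, which is why chomp is the safer choice for reading lines.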

In Raku, the sister language of Perl, strings have a trim method.

Example:

$string = $string.trim;     # remove leading and trailing whitespace
$string .= trim;            # same thing

Tcl

The Tcl string command has three relevant subcommands: trim, trimright and trimleft. For each of those commands, an additional argument may be specified: a string that represents a set of characters to remove—the default is whitespace (space, tab, newline, carriage return).

Example of trimming vowels:

set string onomatopoeia
set trimmed [string trim $string aeiou]         ;# result is nomatop
set r_trimmed [string trimright $string aeiou]  ;# result is onomatop
set l_trimmed [string trimleft $string aeiou]   ;# result is nomatopoeia
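Python's strip family accepts a comparable optional argument naming the set of characters to remove, so the vowel example can be mirrored as a sketch:

```python
word = "onomatopoeia"

trimmed = word.strip("aeiou")     # remove vowels from both ends
r_trimmed = word.rstrip("aeiou")  # remove vowels from the end only
l_trimmed = word.lstrip("aeiou")  # remove vowels from the beginning only

print(trimmed)    # nomatop
print(r_trimmed)  # onomatop
print(l_trimmed)  # nomatopoeia
```

As in Tcl, the argument is a set of characters, not a prefix or suffix string.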

XSLT

XSLT includes the function normalize-space(string) which strips leading and trailing whitespace, in addition to replacing any whitespace sequence (including line breaks) with a single space.

Example:

<xsl:variable name='trimmed'>
   <xsl:value-of select='normalize-space(string)'/>
</xsl:variable>

XSLT 2.0 includes regular expressions, providing another mechanism to perform string trimming.

Another XSLT technique for trimming is to utilize the XPath 2.0 substring() function.
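The effect of normalize-space described above (trim both ends and collapse each internal whitespace run to one space) can be sketched in Python. The function name normalize_space is hypothetical, and the sketch assumes whitespace means space, tab, carriage return, and line feed.

```python
import re


def normalize_space(text: str) -> str:
    """Collapse whitespace runs to a single space and trim both ends."""
    collapsed = re.sub(r"[ \t\r\n]+", " ", text)
    return collapsed.strip(" ")


print(repr(normalize_space("  a \n\t b  ")))  # 'a b'
```

Unlike a plain trim, this also changes the interior of the string, so it is not a drop-in replacement when internal spacing matters.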

References

Uses material from the Wikipedia article Comparison of programming languages (string functions), released under the CC BY-SA 4.0 license.