Comparison of programming languages (basic instructions)
This article compares a large number of programming languages by tabulating their data types, their expression, statement, and declaration syntax, and some common operating-system interfaces.
Conventions of this article
Generally, var is how variable names or other non-literal values to be interpreted by the reader are represented. The rest is literal code. Guillemets (« and ») enclose optional sections. Tab ↹ indicates a necessary (whitespace) indentation. The tables are not sorted lexicographically ascending by programming language name by default, and some languages have entries in some tables but not others.
Type identifiers
- ^a The standard constants int shorts and int lengths can be used to determine how many shorts and longs can be usefully prefixed to short int and long int. The actual sizes of short int, int, and long int are available as the constants short max int, max int, and long max int, etc.
- ^b Commonly used for characters.
- ^c The ALGOL 68, C and C++ languages do not specify the exact width of the integer types short, int, long, and (C99, C++11) long long, so they are implementation-dependent. In C and C++, the short, long, and long long types are required to be at least 16, 32, and 64 bits wide, respectively, but can be more. The int type is required to be at least as wide as short and at most as wide as long, and is typically the width of the word size on the processor of the machine (i.e. on a 32-bit machine it is often 32 bits wide; on 64-bit machines it is sometimes 64 bits wide). C99 and C++11 also define the [u]intN_t exact-width types in the stdint.h header. See C syntax#Integral types for more information. In addition, the types size_t and ptrdiff_t are defined in relation to the address size to hold unsigned and signed integers sufficiently large to handle array indices and the difference between pointers.
- ^d Perl 5 does not have distinct types. Integers, floating point numbers, strings, etc. are all considered "scalars".
- ^e PHP has two arbitrary-precision libraries. The BCMath library simply uses strings as its data type. The GMP library uses an internal "resource" type.
- ^f The value of n is provided by the SELECTED_INT_KIND intrinsic function.
- ^g ALGOL 68G's runtime option --precision "number" can set precision for long long ints to the required "number" significant digits. The standard constants long long int width and long long max int can be used to determine actual precision.
- ^h COBOL allows the specification of a required precision and will automatically select an available type capable of representing the specified precision. "PIC S9999", for example, would require a signed variable of four decimal digits precision. If specified as a binary field, this would select a 16-bit signed type on most platforms.
- ^i Smalltalk automatically chooses an appropriate representation for integral numbers. Typically, two representations are present, one for integers fitting the native word size minus any tag bit (SmallInteger) and one supporting arbitrary sized integers (LargeInteger). Arithmetic operations support polymorphic arguments and return the result in the most appropriate compact representation.
- ^j Ada range types are checked for boundary violations at run-time (as well as at compile-time for static expressions). Run-time boundary violations raise a "constraint error" exception. Ranges are not restricted to powers of two. Commonly predefined Integer subtypes are: Positive (range 1 .. Integer'Last) and Natural (range 0 .. Integer'Last). Short_Short_Integer (8 bits), Short_Integer (16 bits) and Long_Integer (64 bits) are also commonly predefined, but not required by the Ada standard. Runtime checks can be disabled if performance is more important than integrity checks.
- ^k Ada modulo types implement modulo arithmetic in all operations, i.e. no range violations are possible. Modulos are not restricted to powers of two.
- ^l Commonly used for characters like Java's char.
- ^m int in PHP has the same width as the long type in C has on that system.[c]
- ^n Erlang is dynamically typed. The type identifiers are usually used to specify types of record fields and the argument and return types of functions.
- ^o When it exceeds one word.
- ^a The standard constants real shorts and real lengths can be used to determine how many shorts and longs can be usefully prefixed to short real and long real. The actual sizes of short real, real, and long real are available as the constants short max real, max real and long max real, etc., with the constants short small real, small real and long small real available for each type's machine epsilon.
- ^b Declarations of single precision often are not honored.
- ^c The value of n is provided by the SELECTED_REAL_KIND intrinsic function.
- ^d ALGOL 68G's runtime option --precision "number" can set precision for long long reals to the required "number" significant digits. The standard constants long long real width and long long max real can be used to determine actual precision.
- ^e These IEEE floating-point types will be introduced in the next COBOL standard.
- ^f Same size as double on many implementations.
- ^g Swift supports an 80-bit extended precision floating point type, equivalent to long double in C languages.
- ^a The value of n is provided by the SELECTED_REAL_KIND intrinsic function.
- ^b Generic type which can be instantiated with any base floating point type.
Other variable types
- ^a Specifically, strings of arbitrary length that are automatically managed.
- ^b This language represents a boolean as an integer where false is represented as a value of zero and true by a non-zero value.
- ^c All values evaluate to either true or false. Everything in TrueClass evaluates to true and everything in FalseClass evaluates to false.
- ^d This language does not have a separate character type. Characters are represented as strings of length 1.
- ^e Enumerations in this language are algebraic types with only nullary constructors
- ^f The value of n is provided by the SELECTED_INT_KIND intrinsic function.
Derived types
- ^a In most expressions (except as operands of the sizeof and & operators), values of array types in C are automatically converted to a pointer to their first element. See C syntax#Arrays for further details of syntax and pointer operations.
- ^b The C-like type x[] works in Java; however, type[] x is the preferred form of array declaration.
- ^c Subranges are used to define the bounds of the array.
- ^d JavaScript's arrays are a special kind of object.
- ^e The DEPENDING ON clause in COBOL does not create a true variable length array and will always allocate the maximum size of the array.
Other types
- ^a Only classes are supported.
- ^b structs in C++ are actually classes, but have default public visibility and may also be POD objects. C++11 extended this further, to make classes act identically to POD objects in many more cases.
- ^c pair only
- ^d Although Perl doesn't have records, because Perl's type system allows different data types to be in an array, "hashes" (associative arrays) that don't have a variable index would effectively be the same as records.
- ^e Enumerations in this language are algebraic types with only nullary constructors
Variable and constant declarations
- ^a Pascal has declaration blocks. See functions.
- ^b Types are just regular objects, so you can just assign them.
- ^c In Perl, the "my" keyword scopes the variable into the block.
- ^d Technically, this does not declare name to be a mutable variable; in ML, all names can only be bound once. Rather, it declares name to point to a "reference" data structure, which is a simple mutable cell. The data structure can then be read and written to using the ! and := operators, respectively.
- ^e If no initial value is given, an invalid value is automatically assigned (which will trigger a run-time exception if it is used before a valid value has been assigned). While this behaviour can be suppressed, it is recommended in the interest of predictability. If no invalid value can be found for a type (for example in case of an unconstrained integer type), a valid, yet predictable value is chosen instead.
- ^f In Rust, if no initial value is given to a let or let mut variable and it is never assigned to later, there is an "unused variable" warning. If no value is provided for a const or static or static mut variable, there is an error. There is a "non-upper-case globals" warning for non-uppercase const variables. After it is defined, a static mut variable can only be assigned to in an unsafe block or function.
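The Rust note above can be illustrated with a minimal sketch. The names MAX_RETRIES, GREETING, COUNTER, bump_counter and describe are hypothetical and chosen only for this example; the sketch assumes single-threaded use of the mutable global.

```rust
// Compile-time constant: the type annotation and the value are both mandatory.
const MAX_RETRIES: u32 = 3;

// Immutable global: also requires an initial value.
static GREETING: &str = "hello";

// Mutable global: every read or write has to happen inside `unsafe`.
static mut COUNTER: u32 = 0;

/// Increments the (hypothetical) global counter and returns its new value.
fn bump_counter() -> u32 {
    // Assumption: only one thread ever touches COUNTER.
    unsafe {
        COUNTER += 1;
        COUNTER
    }
}

/// A `let` binding may be declared without a value and assigned once later.
fn describe(n: i32) -> &'static str {
    let label;
    if n < 0 {
        label = "negative";
    } else {
        label = "non-negative";
    }
    label
}

fn main() {
    // `let mut` permits reassignment after the initial binding.
    let mut attempts = 0;
    while attempts < MAX_RETRIES {
        attempts += 1;
    }
    println!("{} after {} attempts", GREETING, attempts);
    println!("counter = {}", bump_counter());
    println!("{}", describe(-5));
}
```

Removing the value from the const or static lines makes the program fail to compile, whereas renaming MAX_RETRIES to lower case is expected to produce only a lint warning.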
Conditional statements
- ^a A single instruction can be written on the same line following the colon. Multiple instructions are grouped together in a block which starts on a newline (The indentation is required). The conditional expression syntax does not follow this rule.
- ^b This is pattern matching and is similar to select case but not the same. It is usually used to deconstruct algebraic data types.
- ^c In languages of the Pascal family, the semicolon is not part of the statement. It is a separator between statements, not a terminator.
- ^d END-IF may be used instead of the period at the end.
- ^e In Rust, the comma (,) at the end of a match arm can be omitted after the last match arm, or after any match arm in which the expression is a block (ends in possibly empty matching brackets {}).
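As a short sketch of the Rust comma rule described above (the function name classify is hypothetical), the following match omits the comma after a block-bodied arm and after the last arm:

```rust
/// Classifies an integer by sign.
fn classify(n: i32) -> &'static str {
    match n {
        // Ordinary expression arm: the comma is required here.
        0 => "zero",
        // Block arm: the comma after the closing brace may be omitted.
        x if x < 0 => {
            "negative"
        }
        // Last arm: the trailing comma may be omitted.
        _ => "positive"
    }
}

fn main() {
    for n in [-2, 0, 7].iter() {
        println!("{} is {}", n, classify(*n));
    }
}
```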
- ^a "step n" is used to change the loop interval. If "step" is omitted, then the loop interval is 1.
- ^b This implements the universal quantifier ("for all" or "∀") as well as the existential quantifier ("there exists" or "∃").
- ^c THRU may be used instead of THROUGH.
- ^d «IS» GREATER «THAN» may be used instead of >.
- ^e The type of the set expression must implement the trait std::iter::IntoIterator.
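The IntoIterator requirement in the last note can be sketched as follows. The helper sum_all is hypothetical; it accepts any value a for loop could iterate over, provided the items are i32.

```rust
use std::collections::BTreeSet;

/// Sums the items of anything that can be turned into an iterator of i32.
fn sum_all<I>(items: I) -> i32
where
    I: IntoIterator<Item = i32>,
{
    let mut total = 0;
    // `for` calls IntoIterator::into_iter on `items` behind the scenes.
    for item in items {
        total += item;
    }
    total
}

fn main() {
    // A vector, a range and a set all implement IntoIterator.
    println!("{}", sum_all(vec![1, 2, 3]));
    println!("{}", sum_all(1..=4));
    let set: BTreeSet<i32> = vec![5, 5, 6].into_iter().collect();
    println!("{}", sum_all(set));
}
```

A type that does not implement the trait is rejected at compile time when used after in.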
- ^a Common Lisp allows with-simple-restart, restart-case and restart-bind to define restarts for use with invoke-restart. Unhandled conditions may cause the implementation to show a restarts menu to the user before unwinding the stack.
- ^b Uncaught exceptions are propagated to the innermost dynamically enclosing execution. Exceptions are not propagated across tasks (unless these tasks are currently synchronised in a rendezvous).
Other control flow statements
See reflective programming for calling and declaring functions by strings.
- ^a Pascal requires "forward;" for forward declarations.
- ^b Eiffel allows the specification of an application's root class and feature.
- ^c In Fortran, function/subroutine parameters are called arguments (since PARAMETER is a language keyword); the CALL keyword is required for subroutines.
- ^d Instead of using "foo", a string variable containing the same value may be used.
Where string is a signed decimal number:
- ^a JavaScript only uses floating point numbers so there are some technicalities.
- ^b Perl doesn't have separate types. Strings and numbers are interchangeable.
- ^c NUMVAL-C or NUMVAL-F may be used instead of NUMVAL.
- ^ str::parse is available to convert a string to any type that has an implementation of the std::str::FromStr trait. Both str::parse and FromStr::from_str return a Result that contains the specified type if there is no error. The turbofish (::<_>) on str::parse can be omitted if the type can be inferred from context.
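The three forms mentioned in the Rust note can be compared in a small sketch. The wrapper function names are hypothetical, and trimming whitespace before parsing is an assumption of this example, not part of str::parse itself.

```rust
use std::num::{ParseFloatError, ParseIntError};
use std::str::FromStr;

/// Explicit turbofish: the target type is named at the call site.
fn parse_signed(text: &str) -> Result<i32, ParseIntError> {
    text.trim().parse::<i32>()
}

/// No turbofish: the target type is inferred from the return type.
fn parse_inferred(text: &str) -> Result<i64, ParseIntError> {
    text.parse()
}

/// Calling the trait method directly instead of str::parse.
fn parse_via_trait(text: &str) -> Result<f64, ParseFloatError> {
    f64::from_str(text)
}

fn main() {
    // Each call yields a Result; errors are reported rather than panicking.
    match parse_signed(" -42 ") {
        Ok(n) => println!("parsed {}", n),
        Err(e) => println!("error: {}", e),
    }
    println!("{:?}", parse_inferred("9000000000"));
    println!("{:?}", parse_via_trait("1.5"));
    println!("{:?}", parse_signed("abc"));
}
```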
- ^a ALGOL 68 additionally has the "unformatted" transput routines: read, write, get, and put.
- ^b gets(x) and fgets(x, length, stdin) read unformatted text from stdin. Use of gets is not recommended.
- ^c puts(x) and fputs(x, stdout) write unformatted text to stdout.
- ^d fputs(x, stderr) writes unformatted text to stderr.
- ^e INPUT_UNIT, OUTPUT_UNIT, ERROR_UNIT are defined in the ISO_FORTRAN_ENV module.
Reading command-line arguments
- ^a In Rust, std::env::args and std::env::args_os return iterators, std::env::Args and std::env::ArgsOs respectively. Args converts each argument to a String and it panics if it reaches an argument that cannot be converted to UTF-8. ArgsOs returns a non-lossy representation of the raw strings from the operating system (std::ffi::OsString), which can be invalid UTF-8.
- ^b In Visual Basic, command-line arguments are not separated. Separating them requires a split function Split(string).
- ^c The COBOL standard includes no means to access command-line arguments, but common compiler extensions to access them include defining parameters for the main program or using ACCEPT statements.
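For the Rust case in note a, one possible way to avoid the panic is to read the arguments through args_os and convert them lossily. This is a sketch only: the helper names utf8_args and lossy are hypothetical, and replacing invalid sequences with U+FFFD is a design choice of this example.

```rust
use std::env;
use std::ffi::OsString;

/// Collects arguments as Strings; iteration panics on non-UTF-8 input.
fn utf8_args() -> Vec<String> {
    // skip(1) drops the program name, which is conventionally the first item.
    env::args().skip(1).collect()
}

/// Converts raw OS strings to Strings, replacing invalid sequences
/// with U+FFFD instead of panicking.
fn lossy<I>(args: I) -> Vec<String>
where
    I: IntoIterator<Item = OsString>,
{
    args.into_iter()
        .map(|arg| arg.to_string_lossy().into_owned())
        .collect()
}

fn main() {
    // Safe for any input: args_os never panics on invalid UTF-8.
    for arg in lossy(env::args_os().skip(1)) {
        println!("{}", arg);
    }
    // Equivalent for well-formed input, but may panic otherwise.
    println!("{} UTF-8 argument(s)", utf8_args().len());
}
```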
Execution of commands
- ^a Fortran 2008 or newer.