Alphabets and Tokens in C

Learning C Programming:

The steps for learning the C language are much like the steps we follow to learn a natural language such as English. The sequence of steps can be summarized as follows:

Alphabet->Tokens->Statements->Program

Here are some general steps you can follow to learn the C programming language:

  1. Learn the basics: Start by learning the fundamental concepts of C programming, including data types, variables, operators, control structures, functions, and arrays. There are many online resources and books available to help you learn the basics of C.
  2. Practice writing code: Once you have a basic understanding of the concepts, start practicing writing code in C. Start with simple programs, such as a program that prints “Hello, World!” to the console. As you become more comfortable with the language, try writing more complex programs.
  3. Read code written by others: Reading code written by other programmers can be a great way to learn new techniques and gain insight into how other programmers approach problems. Look for open-source C projects on GitHub or other online repositories and try to understand how they work.
  4. Get feedback: Getting feedback from other programmers can help you identify areas where you need to improve. Join online programming communities, attend coding meetups or workshops, or find a mentor who can help you improve your coding skills.
  5. Practice regularly: Like any skill, learning to program takes practice. Set aside regular time to practice writing code and experimenting with different techniques. The more you practice, the better you will become.
  6. Build projects: Once you have a good grasp of the fundamentals of C programming, start building projects. Choose projects that interest you, such as building a simple game or creating a command-line tool. Building projects will help you apply what you’ve learned and gain practical experience.

Remember that learning to program takes time and effort. Be patient with yourself, don’t be afraid to make mistakes, and keep practicing. With persistence and dedication, you can become a skilled C programmer.

Building Blocks of C Program:

The fundamental building blocks of the C programming language include:

  1. Alphabet (Character Set): The alphabet, or character set, in C consists of all the characters that can be used to write C code, including letters, digits, punctuation marks, and other special characters.
  2. Tokens: A token in C is the smallest unit of a program that the compiler can recognize and process. Tokens include keywords, identifiers, constants, strings, operators, and special symbols (punctuators).
  3. Data Types: In C, a data type specifies the type of data that a variable can hold. Common data types in C include int, float, double, char, and void. Each data type has a specific size and range of values.
  4. Variables: A variable in C is a named storage location in memory used to hold a value of a specific data type. Before using a variable in C, it must be declared with its data type.
  5. Constants: Constants in C are fixed values that do not change during the execution of a program. They include numeric constants (integer or floating-point values), character constants (a single character enclosed in single quotes, such as 'A'), and string constants (a sequence of characters enclosed in double quotes, such as "hello").

These building blocks are essential to understanding how the C programming language works and how to write effective and efficient code in C.

In short, the fundamental concepts of the C language are: alphabet (character set), tokens, data types, variables, and constants.

Alphabets

A symbol that is used while writing a program is called a character or an alphabet. A character may be a letter, a digit, or a special symbol. On most systems the character set of the C programming language is a subset of ASCII (American Standard Code for Information Interchange), which defines 128 characters, each represented by a unique integer value ranging from 0 to 127. These characters include:

  • Uppercase letters (A-Z)
  • Lowercase letters (a-z)
  • Digits (0-9)
  • Special Characters / Basic punctuation marks (., ;, :, !, ?, -, etc.)
  • Space character
  • Control characters (such as newline, tab, and carriage return)
  • Delimiters

Type                          Symbols or Character Set
Letters                       a b c d e f g h i j k l m n o p q r s t u v w x y z
                              A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
Digits                        0 1 2 3 4 5 6 7 8 9
Symbols / Special Characters  ! " # % & ' ( ) * + , - . / : ; < = > ? [ \ ] ^ _ { | } ~
White Spaces                  form feed (\f), newline (\n), space, horizontal tab (\t), and vertical tab (\v)
Delimiters                    : ; ( ) [ ] { } # ,

Table: Alphabets of C

In addition to the standard ASCII characters, 8-bit extended character sets (often loosely called extended ASCII) are also commonly used in C programming, giving 256 possible character values. These extended sets add further special characters, symbols, and international characters that are used in different languages.

It’s worth noting that the exact characters that can be used as identifiers and variable names in C programming may depend on the specific compiler or platform being used, as well as any extensions or libraries being used in the program. However, in general, the standard ASCII character set is used for identifiers and variable names in C programming.

Tokens

A token is the smallest, or basic, unit of a C program. Tokens are the basic building blocks of the C language (they resemble a word in English, or a pada in Kannada) and are used to write a C program. There are six categories of tokens in C, as given below:

  1. Keywords (e.g. int, while)
  2. Identifiers (e.g. main, total)
  3. Constants (e.g. 10, 20)
  4. Strings (e.g. "total", "hello")
  5. Special symbols (e.g. (), {})
  6. Operators (e.g. +, /, -, *)

Consider a simple C program as given below:

#include <stdio.h>   /* declares printf */

int main()
{
     int x, y, total;
     x = 10, y = 20;
     total = x + y;
     printf("Total =%d\n", total);
     return 0;
}
Following are the different tokens used in the above program:
•	int – keyword
•	main, x, y, total, printf – identifiers
•	{, }, (, ) – delimiters or special symbols
•	10, 20 – constants
•	=, + – operators
•	"Total =%d\n" – string
•	All of the above (main, {, }, (, ), int, x, y, total and so on) – tokens

The following sections discuss each category of token in detail.

a. Keywords:

A keyword is a reserved word with a predefined meaning in C. ANSI C (C89) defines 32 keywords; later standards such as C99 and C11 add a few more. Each keyword has a fixed meaning that cannot be changed by the user. For example, in the statement int money; the word int is a keyword that indicates money is a variable of type integer. Because C is case sensitive, all keywords must be written in lowercase. The list below describes all the keywords predefined by ANSI C.

Description of Keywords

  1. auto : Declares a local variable with automatic (local) lifetime.
  2. break : Passes control out of the enclosing loop or switch statement.
  3. case : Used in a switch statement to mark a branch point.
  4. char : Represents the data type that holds characters.
  5. const : Makes a variable's value or a pointer parameter unmodifiable.
  6. continue : Skips the rest of the current iteration and passes control to the next iteration of the loop.
  7. default : Specifies the default block of code in a switch statement.
  8. do : Starts a do while loop; it is used together with while to form a loop that tests its condition after each iteration.
  9. double : Represents a double precision floating point data type.
  10. else : Indicates an alternative branch in the if else statement.
  11. enum : Defines a set of named constants of type int.
  12. extern : Indicates that an identifier is defined elsewhere.
  13. float : Represents a single precision floating point data type.
  14. for : Used in the for loop to repeat a block of statements.
  15. goto : Used to jump unconditionally from one part of a function to another.
  16. if : Used to execute statements conditionally.
  17. int : Refers to a fundamental data type that holds integer values.
  18. long : Modifier used with basic data types to hold long type values.
  19. register : Asks the compiler to store the variable being declared in a CPU register.
  20. return : Exits immediately from the currently executing function to the calling routine, optionally returning a value.
  21. short : Modifier used to represent the short type values of basic data types.
  22. signed : Modifier that specifies the signed type values of a data type.
  23. sizeof : Yields the size, in bytes, of a type or expression.
  24. static : Preserves a variable's value after its scope ends.
  25. struct : Groups variables into a single record.
  26. switch : Multiple branching statement that causes control to branch to one of a list of possible statements in the block.
  27. typedef : Assigns a new name to a data type definition.
  28. union : Groups variables that share the same storage space.
  29. unsigned : Modifier used to represent the unsigned type values of a data type.
  30. void : Represents the empty data type.
  31. volatile : Indicates that a variable can be changed outside the normal flow of the program, for example by a background routine.
  32. while : Repeats execution of statements while the condition is true.

b. Identifiers:

In the C programming language, identifiers are names given to C entities such as variables, functions, and structures. An identifier gives an entity a unique name so that it can be referred to within the program. For example:

float avg, height;
int   no_stud, tot_marks;

Here, avg and height are identifiers that denote variables of type float. Similarly, no_stud and tot_marks are identifiers that denote variables of type integer.

Rules for writing identifier

  1. An identifier can be composed of letters (both uppercase and lowercase), digits, and the underscore '_' only.
  2. The first character of an identifier must be either a letter or an underscore.
  3. An identifier must not be a keyword or other reserved word.
  4. In C99, at least the first 63 characters of a local (internal) identifier and the first 31 characters of a global (external) identifier are guaranteed to be significant; many compilers allow more.

Tips for Good Programming Practice: A programmer can choose any name for an identifier. However, a meaningful name makes the program easier to understand and work on, particularly in the case of a large program.

c. Constants:

Constants are values that cannot be changed during the execution of a program, for example 1, 2.5, and "Programming is easy." In C, constants can be classified as integer constants, floating point constants, character constants, and string constants.

d. Operators:

In the C programming language, operators are special symbols or characters that perform specific operations on one or more operands (values or variables). For example, + is the operator that performs addition. Other examples: -, *, /, &, ^, &&, ||, --, etc.

Here are some common operators in C:

  1. Arithmetic operators: perform basic mathematical operations like addition, subtraction, multiplication, division, and modulus.
int a = 10, b = 3;
int c = a + b; // c = 13
int d = a - b; // d = 7
int e = a * b; // e = 30
int f = a / b; // f = 3
int g = a % b; // g = 1

2. Relational operators: compare two values and return a boolean value (1 or 0) based on the result.

int a = 10, b = 3;
int c = (a == b); // c = 0 (false)
int d = (a != b); // d = 1 (true)
int e = (a > b); // e = 1 (true)
int f = (a < b); // f = 0 (false)
int g = (a >= b); // g = 1 (true)
int h = (a <= b); // h = 0 (false)

3. Logical operators: perform logical operations on boolean values and return a boolean value.

int a = 10, b = 3;
int c = (a > 5 && b < 10); // c = 1 (true)
int d = (a > 5 || b > 10); // d = 1 (true)
int e = !(a > 5); // e = 0 (false)

4. Bitwise operators: perform operations on the binary representation of values.

int a = 10, b = 3;
int c = a & b; // c = 2 (1010 & 0011 = 0010)
int d = a | b; // d = 11 (1010 | 0011 = 1011)
int e = a ^ b; // e = 9 (1010 ^ 0011 = 1001)
int f = ~a; // f = -11 (~00001010 = 11110101)
int g = a << 1; // g = 20 (00001010 << 1 = 00010100)
int h = a >> 1; // h = 5 (00001010 >> 1 = 00000101)

5. Assignment operators: assign values to variables and perform arithmetic or bitwise operations at the same time.

int a = 10, b = 3;
a += b; // equivalent to a = a + b; (a = 13)
a -= b; // equivalent to a = a - b; (a = 10)
a *= b; // equivalent to a = a * b; (a = 30)
a /= b; // equivalent to a = a / b; (a = 10)
a %= b; // equivalent to a = a % b; (a = 1)
a &= b; // equivalent to a = a & b; (a = 1)
a |= b; // equivalent to a = a | b; (a = 3)
a ^= b; // equivalent to a = a ^ b; (a = 0)
a <<= 1; // equivalent to a = a << 1; (a = 0)
a >>= 1; // equivalent to a = a >> 1; (a = 0)

e. Special Symbols:

The following special symbols have a special meaning in C and therefore cannot be used for any other purpose: { }, [ ], ( ), etc.

Braces { }: The opening and closing curly braces mark the start and end of a block of code, such as a function body or a group of statements treated as one compound statement.

Parentheses ( ): Parentheses are used in function calls and to enclose function parameters.

Brackets [ ]: Opening and closing brackets are used to reference array elements. They indicate single and multidimensional subscripts.

f. Strings:

In the C programming language, a string is a sequence of characters terminated by a null character '\0'. It is stored as an array of characters. Here is an example of a string declaration and initialization:

char str[] = "Hello, world!";

This creates a character array str with enough memory to hold the string “Hello, world!” and adds the null terminator at the end automatically. Here are some other examples of strings:

char str1[] = "This is a string.";
char str2[] = "12345";
char str3[] = "";

In the first example, str1 is a string of characters that reads "This is a string.". In the second example, str2 is a string of characters that reads "12345". In the third example, str3 is an empty string, which consists only of the null character '\0'.