
[Compiler Principles] The Complete C Grammar (Including the Preprocessor Grammar)

2020-05-19 15:14 Author: GC_CH

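Notes:

    Everything below is based on the ISO C89 standard. A | at the start of a line means "or", and anything inside [ ] is optional. A | that is not at the start of a line stands for the character '|' itself; where the grammar needs the bracket characters themselves, they are written '[' and ']'. Two shorthand forms also appear: a rule whose body is a plain space-separated list of characters or operators (digit, unary_operator, operator and so on) means "one of these", and in the short one-line rules (sign, the suffix rules, simple_escape_sequence) the | still separates alternatives.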

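    This grammar is good for understanding the syntax of C, but it is not suitable for writing a compiler as it stands. It contains a lot of left recursion and many common left factors, it is ambiguous (the if/else rules, for example), and it runs to roughly 500 lines. Most importantly, it is not context-free: typedef_name is just an identifier, so whether a phrase is a declaration or an expression can depend on earlier declarations. That is why the LL and LR parsing methods from compiler theory cannot be applied to it directly T_T.

    In other words, the grammar needs some special handling before it can be used to build a C compiler.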
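
    To make the context dependence concrete, here is a small sketch. The function names are made up for this illustration. The token sequence `T * x` is a declaration in the first function and a multiplication in the second, purely because of what `T` names in the current scope:

```c
#include <assert.h>
#include <stddef.h>

typedef int T;                 /* at file scope, T is a typedef name */

/* Here "T * x" starts a declaration: x is a pointer to int. */
int declares_pointer(void)
{
    T * x = NULL;
    assert(sizeof(T) == sizeof(int));   /* T names a type in this scope */
    return x == NULL;
}

/* Here T is redeclared as a variable, so "T * x" is a multiplication. */
int multiplies(void)
{
    int T = 6;
    int x = 7;
    return T * x;
}
```

    A parser working from the grammar alone cannot tell these two cases apart; it needs to know what has been declared so far.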

    The start symbol of the grammar is translation_unit (the translation unit).

    Typed entirely by hand.

C89 Grammar

I. Lexical Grammar

1. Tokens and Preprocessing Tokens

token : 

    keyword 

    | identifier

    | constant

    | string_literal

    | operator

    | punctuator


preprocessing_token : 

    header_name 

    | identifier

    | pp_number

    | character_constant

    | string_literal

    | operator

    | punctuator

    | any single non-white-space character that is none of the above


2. Keywords

keyword:

    auto        double      int         struct

    break       else        long        switch

    case        enum        register    typedef

    char        extern      return      union

    const       float       short       unsigned

    continue    for         signed      void

    default     goto        sizeof      volatile

    do          if          static      while


3. Identifiers

identifier :

    nodigit

    | identifier nodigit

    | identifier digit


nodigit : 

    _ a b c d e f g h i j k l m n o p q r s t

    u v w x y z A B C D E F G H I J K L M N O

    P Q R S T U V W X Y Z


digit :

    0 1 2 3 4 5 6 7 8 9
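
The three rules above translate almost mechanically into a recognizer. The following is only a sketch: the function name `is_identifier` is made up here, ASCII source text is assumed, and keywords are not rejected (that is a separate check).

```c
#include <assert.h>
#include <stddef.h>

/* nodigit: an underscore or an ASCII letter. */
static int is_nodigit(int c)
{
    return c == '_' || (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z');
}

/* digit: '0' through '9'. */
static int is_digit(int c)
{
    return c >= '0' && c <= '9';
}

/* Returns 1 if s matches the identifier rule, 0 otherwise. */
int is_identifier(const char *s)
{
    size_t i;

    assert(s != NULL);
    if (!is_nodigit(s[0]))
        return 0;                 /* must start with a nodigit */
    for (i = 1; s[i] != '\0'; i++) {
        if (!is_nodigit(s[i]) && !is_digit(s[i]))
            return 0;             /* then any mix of nodigit and digit */
    }
    return 1;
}
```

The left-recursive rule "identifier nodigit | identifier digit" becomes a plain loop, which is the usual way to handle this kind of rule in a hand-written lexer.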


4. Constants

constant :

    floating_constant

    | integer_constant

    | enumeration_constant

    | character_constant


floating_constant :

    fractional_constant [exponent_part] [floating_suffix]

    | digit_sequence exponent_part [floating_suffix]


fractional_constant : 

    [digit_sequence] . digit_sequence

    | digit_sequence .


exponent_part:

    e [sign] digit_sequence

    | E [sign] digit_sequence


sign:

    + | -


digit_sequence:

    digit

    | digit_sequence digit


floating_suffix:

    f | l | F | L


integer_constant:

    decimal_constant [integer_suffix]

    | octal_constant [integer_suffix]

    | hexadecimal_constant [integer_suffix]


decimal_constant:

    nonzero_digit

    | decimal_constant digit


octal_constant:

    0

    | octal_constant octal_digit


hexadecimal_constant:

    0x hexadecimal_digit

    | 0X hexadecimal_digit

    | hexadecimal_constant hexadecimal_digit


nonzero_digit:

    1 2 3 4 5 6 7 8 9


octal_digit:

    0 1 2 3 4 5 6 7


hexadecimal_digit:

     0 1 2 3 4 5 6 7 8 9

     a b c d e f

     A B C D E F


integer_suffix:

    unsigned_suffix [long_suffix]

    | long_suffix [unsigned_suffix]


unsigned_suffix :

    u | U


long_suffix :

    l | L
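
As a rough illustration of how the integer_constant rules fit together, the sketch below checks a string against them and reports the radix. The function name `integer_constant_radix` is an assumption of this example, ASCII is assumed, and no range checking is done.

```c
#include <assert.h>
#include <stddef.h>

/* Returns 8, 10 or 16 if s is a well-formed integer_constant, 0 otherwise. */
int integer_constant_radix(const char *s)
{
    size_t i = 0, digits = 0;
    int radix, seen_u = 0, seen_l = 0;

    assert(s != NULL);
    if (s[0] == '0' && (s[1] == 'x' || s[1] == 'X')) {
        radix = 16;               /* hexadecimal prefix, digits follow */
        i = 2;
    } else if (s[0] == '0') {
        radix = 8;                /* a lone 0 is an octal constant */
    } else {
        radix = 10;
    }

    /* Consume the digits that are valid in this radix. */
    for (; s[i] != '\0'; i++, digits++) {
        int c = s[i], v;
        if (c >= '0' && c <= '9') v = c - '0';
        else if (c >= 'a' && c <= 'f') v = c - 'a' + 10;
        else if (c >= 'A' && c <= 'F') v = c - 'A' + 10;
        else break;
        if (v >= radix) break;
    }
    if (digits == 0)
        return 0;

    /* integer_suffix: at most one u/U and at most one l/L, in either order. */
    for (; s[i] != '\0'; i++) {
        if ((s[i] == 'u' || s[i] == 'U') && !seen_u) seen_u = 1;
        else if ((s[i] == 'l' || s[i] == 'L') && !seen_l) seen_l = 1;
        else return 0;
    }
    return radix;
}
```

Note that "08" is rejected: the leading 0 selects the octal rule, and 8 is not an octal_digit.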


enumeration_constant :

    identifier


character_constant:

    ' c_char_sequence '

    | L' c_char_sequence '


c_char_sequence:

    c_char

    | c_char_sequence c_char


c_char:

    any member of the source character set except the single quote ', the backslash \, and the new-line character \n

    | escape_sequence


escape_sequence:

    simple_escape_sequence

    | octal_escape_sequence

    | hexadecimal_escape_sequence


simple_escape_sequence: \' | \" | \? | \\ | \a | \b | \f | \n | \r | \t | \v


octal_escape_sequence:

    \ octal_digit

    | \ octal_digit octal_digit

    | \ octal_digit octal_digit octal_digit


hexadecimal_escape_sequence:

    \x hexadecimal_digit

    | hexadecimal_escape_sequence hexadecimal_digit
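
The three kinds of escape sequence can be decoded with one small function. This is a sketch under a few assumptions: `decode_escape` and `hex_value` are names invented here, the input is ASCII, and overflow of very long hexadecimal escapes is not diagnosed.

```c
#include <assert.h>
#include <stddef.h>

/* Value of a hexadecimal_digit, or -1 if c is not one. */
static int hex_value(int c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

/* Decodes the escape_sequence starting at s (s[0] must be a backslash).
   Stores its value in *value and returns the number of characters
   consumed, or 0 if the sequence is malformed. */
size_t decode_escape(const char *s, unsigned long *value)
{
    size_t i = 1;

    assert(s != NULL && value != NULL && s[0] == '\\');
    if (s[1] >= '0' && s[1] <= '7') {        /* octal: one to three digits */
        *value = 0;
        while (i <= 3 && s[i] >= '0' && s[i] <= '7')
            *value = *value * 8 + (unsigned long)(s[i++] - '0');
        return i;
    }
    if (s[1] == 'x') {                       /* hex: one or more digits */
        i = 2;
        *value = 0;
        while (hex_value(s[i]) >= 0)
            *value = *value * 16 + (unsigned long)hex_value(s[i++]);
        return i > 2 ? i : 0;
    }
    switch (s[1]) {                          /* simple_escape_sequence */
    case '\'': *value = '\''; break;
    case '"':  *value = '"';  break;
    case '?':  *value = '?';  break;
    case '\\': *value = '\\'; break;
    case 'a':  *value = '\a'; break;
    case 'b':  *value = '\b'; break;
    case 'f':  *value = '\f'; break;
    case 'n':  *value = '\n'; break;
    case 'r':  *value = '\r'; break;
    case 't':  *value = '\t'; break;
    case 'v':  *value = '\v'; break;
    default:   return 0;
    }
    return 2;
}
```

The octal branch stops after three digits, as the grammar requires, while the hexadecimal branch keeps going for as long as it sees hexadecimal digits.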


5. String Literals


string_literal:

    " [s_char_sequence] "

    | L" [s_char_sequence] "


s_char_sequence:

    s_char

    | s_char_sequence s_char


s_char:

    any member of the source character set except the double quote ", the backslash \, and the new-line character \n

    | escape_sequence


6. Operators


operator:

    [ ] ( ) . -> ++ -- & * + - ~ ! sizeof / % << >> < > <= >= == != ^ | && ||

    ? : = *= /= %= += -= <<= >>= &= ^= |= , # ##


7. Punctuators


punctuator:

    [ ] ( ) { } * , : = ; ... #


8. Header Names


header_name:

    < h_char_sequence >

    | " q_char_sequence "


h_char_sequence:

    h_char

    | h_char_sequence h_char


h_char:

    any member of the source character set except the new-line character \n and the greater-than sign >


q_char_sequence:

    q_char

    | q_char_sequence q_char


q_char:

    any member of the source character set except the new-line character \n and the double quote "


9. Preprocessing Numbers


pp_number:

    digit

    | .digit

    | pp_number digit

    | pp_number nodigit

    | pp_number e sign

    | pp_number E sign

    | pp_number .
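
The pp_number rule is deliberately loose: it accepts many spellings that are not valid constants. The sketch below measures the longest pp_number at the start of a string. The function name `pp_number_length` is invented for this example, and it follows the C89 rule that a pp_number may be extended by a digit, a nodigit, a period, or e/E followed by a sign.

```c
#include <assert.h>
#include <stddef.h>

static int is_digit(int c)
{
    return c >= '0' && c <= '9';
}

static int is_nodigit(int c)
{
    return c == '_' || (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z');
}

/* Returns the length of the longest pp_number that starts at s,
   or 0 if no pp_number starts there. */
size_t pp_number_length(const char *s)
{
    size_t i;

    assert(s != NULL);
    if (is_digit(s[0]))
        i = 1;                                 /* digit */
    else if (s[0] == '.' && is_digit(s[1]))
        i = 2;                                 /* . digit */
    else
        return 0;

    for (;;) {
        if ((s[i] == 'e' || s[i] == 'E') &&
            (s[i + 1] == '+' || s[i + 1] == '-'))
            i += 2;                            /* pp_number e sign, E sign */
        else if (is_digit(s[i]) || is_nodigit(s[i]) || s[i] == '.')
            i += 1;                            /* digit, nodigit, or . */
        else
            return i;
    }
}
```

For instance, `0x1e+1` is a single pp_number of six characters even though it is not a valid constant, whereas in `12+3` only `12` is consumed, because a sign is absorbed only right after e or E.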


II. Syntax


1. Expressions


primary_expression:

    identifier

    | constant

    | string_literal

    | (expression)


postfix_expression:

    primary_expression

    | postfix_expression '[' expression ']'

    | postfix_expression ([argument_expression_list])

    | postfix_expression . identifier

    | postfix_expression -> identifier

    | postfix_expression ++

    | postfix_expression --


argument_expression_list:

    assignment_expression

    | argument_expression_list , assignment_expression


unary_expression:

    postfix_expression

    | ++ unary_expression

    | -- unary_expression

    | unary_operator cast_expression

    | sizeof unary_expression

    | sizeof (type_name)


unary_operator:

    & * + - ~ !


cast_expression:

    unary_expression

    | (type_name) cast_expression


multiplicative_expression:

    cast_expression

    | multiplicative_expression * cast_expression

    | multiplicative_expression / cast_expression

    | multiplicative_expression % cast_expression

additive_expression:

    multiplicative_expression

    | additive_expression + multiplicative_expression

    | additive_expression - multiplicative_expression


shift_expression:

    additive_expression

    | shift_expression << additive_expression

    | shift_expression >> additive_expression



relational_expression:

    shift_expression

    | relational_expression < shift_expression

    | relational_expression <= shift_expression

    | relational_expression > shift_expression

    | relational_expression >= shift_expression


equality_expression:

    relational_expression

    | equality_expression == relational_expression

    | equality_expression != relational_expression


and_expression:

    equality_expression

    | and_expression & equality_expression


exclusive_or_expression:

    and_expression

    | exclusive_or_expression ^ and_expression


inclusive_or_expression:

    exclusive_or_expression

    | inclusive_or_expression | exclusive_or_expression


logical_and_expression:

    inclusive_or_expression

    | logical_and_expression && inclusive_or_expression


logical_or_expression:

    logical_and_expression

    | logical_or_expression || logical_and_expression


conditional_expression:

    logical_or_expression 

    | logical_or_expression ? expression : conditional_expression


assignment_expression:

    conditional_expression

    | unary_expression assignment_operator assignment_expression


assignment_operator:

    = *= /= %= += -= <<= >>= &= ^= |=


expression:

    assignment_expression

    | expression , assignment_expression


constant_expression:

    conditional_expression
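
Almost every expression rule above is left-recursive, which a recursive-descent (LL-style) parser cannot follow literally. The usual workaround is to turn "A : B | A op B" into "B followed by zero or more (op B)" and fold the results from the left. Here is a minimal sketch of that idea for a tiny subset only: unsigned decimal numbers, `*`, `+` and `-`, no white space and no error handling. All function names are invented for this example.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for cast_expression: an unsigned decimal number. */
static long parse_operand(const char **p)
{
    long v = 0;
    while (**p >= '0' && **p <= '9')
        v = v * 10 + (*(*p)++ - '0');
    return v;
}

/* multiplicative_expression, with the left recursion rewritten as a loop
   (only * is handled in this sketch). */
static long parse_multiplicative(const char **p)
{
    long left = parse_operand(p);
    while (**p == '*') {
        (*p)++;
        left *= parse_operand(p);
    }
    return left;
}

/* additive_expression, again as a loop that folds from the left. */
long evaluate_additive(const char *s)
{
    const char *p = s;
    long left;

    assert(s != NULL);
    left = parse_multiplicative(&p);
    while (*p == '+' || *p == '-') {
        char op = *p++;
        long right = parse_multiplicative(&p);
        left = (op == '+') ? left + right : left - right;
    }
    return left;
}
```

Because the loop accumulates into `left`, `10-4-3` is grouped as (10-4)-3, which is the left associativity that the left-recursive rule encodes. The layering of the two functions gives `*` its higher precedence.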


2. Declarations


declaration:

    declaration_specifier [init_declarator_list] ;


declaration_specifier:

    storage_class_specifier [declaration_specifier]

    | type_specifer [declaration_specifier]

    | type_qualifier [declaration_specifier]


init_declarator_list:

    init_declarator

    | init_declarator_list , init_declarator


init_declarator:

    declarator

    | declarator = initializer


storage_class_specifier:

    typedef

    | extern

    | static

    | auto

    | register


type_specifer:

    void

    | char

    | short

    | int

    | long

    | float

    | double

    | signed

    | unsigned

    | struct_or_union_specifier

    | enum_specifier

    | typedef_name


struct_or_union_specifier:

    struct_or_union [identifier] { struct_declaration_list }

    | struct_or_union identifier


struct_or_union:

    struct

    | union


struct_declaration_list:

    struct_declaration

    | struct_declaration_list struct_declaration


struct_declaration:

    specifier_qualifier_list struct_declarator_list;


specifier_qualifier_list:

    type_specifer [specifier_qualifier_list]

    | type_qualifier [specifier_qualifier_list]


struct_declarator_list:

    struct_declarator

    | struct_declarator_list, struct_declarator


struct_declarator:

    declarator

    | [declarator] : constant_expression


enum_specifier:

    enum [identifier] {enumerator_list}

    | enum identifier


enumerator_list:

    enumerator

    | enumerator_list, enumerator


enumerator:

    enumeration_constant

    | enumeration_constant = constant_expression


enumeration_constant:

    identifier


type_qualifier:

    const

    | volatile


parameter_declaration:

    declaration_specifier declarator

    | declaration_specifier [abstract_declarator]


declarator:

    [pointer] direct_declarator


abstract_declarator:

    pointer

    | [pointer] direct_abstract_declarator


direct_declarator:

    identifier

    | ( declarator )

    | direct_declarator '[' [constant_expression] ']'

    | direct_declarator ( parameter_type_list )

    | direct_declarator ( [identifier_list] )


direct_abstract_declarator:

    ( abstract_declarator )

    | [direct_abstract_declarator] '[' [constant_expression] ']'

    | [direct_abstract_declarator] ( [parameter_type_list] )


pointer:

    * [type_qualifier_list]

    | * [type_qualifier_list] pointer
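
The way the pointer, declarator and direct_declarator rules nest explains why parentheses change the meaning of a declaration. The small sketch below uses variable and function names invented for illustration, and the sizes are checked rather than assumed.

```c
#include <assert.h>
#include <stddef.h>

/* declarator = pointer direct_declarator, where the direct_declarator is
   "array_of_pointers [3]": the [3] attaches to the name before the *. */
int *array_of_pointers[3];

/* Here "( declarator )" groups the * with the name first, and the [3]
   is applied afterwards: a pointer to an array of three int. */
int (*pointer_to_array)[3];

void check_declarator_shapes(void)
{
    /* An array of three pointers occupies three pointers' worth of storage. */
    assert(sizeof array_of_pointers == 3 * sizeof(int *));
    /* Dereferencing the pointer yields a whole array of three int. */
    assert(sizeof *pointer_to_array == 3 * sizeof(int));
}
```

The type of the second variable written without a name, `int (*)[3]`, is an example of a type_name built from an abstract_declarator.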


type_qualifier_list:

    type_qualifier

    | type_qualifier_list type_qualifier


parameter_type_list:

    parameter_list

    | parameter_list, ...


parameter_list:

    parameter_declaration

    | parameter_list, parameter_declaration


identifier_list:

    identifier

    | identifier_list, identifier



type_name:

    specifier_qualifier_list [abstract_declarator]



typedef_name:

    identifier
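
Since typedef_name is just an identifier, the grammar alone cannot separate the two. One common workaround, often called the "lexer hack", is to let the parser record typedef names in a table that the lexer consults. What follows is only a sketch of that idea and not part of the grammar: every name in it is hypothetical, scopes are ignored, and the table stores the caller's pointers, so the strings must outlive the table.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum token_kind { TOK_IDENTIFIER, TOK_TYPEDEF_NAME };

#define MAX_TYPEDEF_NAMES 64

static const char *typedef_names[MAX_TYPEDEF_NAMES];
static size_t typedef_count = 0;

/* Would be called by the parser after it has parsed a typedef declaration.
   Returns 1 on success, 0 if the table is full. */
int register_typedef_name(const char *name)
{
    assert(name != NULL);
    if (typedef_count == MAX_TYPEDEF_NAMES)
        return 0;
    typedef_names[typedef_count++] = name;
    return 1;
}

/* Would be called by the lexer for every identifier-shaped word, so that
   the parser sees two distinct token kinds. */
enum token_kind classify_identifier(const char *name)
{
    size_t i;

    assert(name != NULL);
    for (i = 0; i < typedef_count; i++) {
        if (strcmp(typedef_names[i], name) == 0)
            return TOK_TYPEDEF_NAME;
    }
    return TOK_IDENTIFIER;
}
```

A real compiler would also need to handle block scope, where an inner declaration can hide a typedef name again.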


initializer:

    assignment_expression

    | {initializer_list}

    | {initializer_list,}


initializer_list:

    initializer

    | initializer_list, initializer


3. Statements

statement:

    labeled_statement

    | compound_statement

    | expression_statement

    | selection_statement

    | iteration_statement

    | jump_statement


labeled_statement:

    identifier : statement

    | case constant_expression : statement

    | default : statement


compound_statement:

    {[declaration_list] [statement_list]}


declaration_list:

    declaration

    | declaration_list declaration


statement_list:

    statement

    | statement_list statement


expression_statement:

    [expression] ;


selection_statement:

    if(expression) statement

    | if(expression) statement else statement

    | switch(expression) statement
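
The two if alternatives above are the classic source of ambiguity: in "if (a) if (b) s1 else s2" the grammar allows the else to pair with either if. C resolves it by pairing each else with the nearest preceding if that has no else yet. A small sketch, with function names made up for this example, shows the effect:

```c
#include <assert.h>

int which_branch(int a, int b)
{
    int result = 0;

    if (a)
        if (b)
            result = 1;
        else            /* pairs with "if (b)", not with "if (a)" */
            result = 2;
    return result;
}

void check_dangling_else(void)
{
    assert(which_branch(0, 0) == 0);   /* outer if fails, so no else runs */
    assert(which_branch(1, 0) == 2);   /* inner if fails, its else runs */
    assert(which_branch(1, 1) == 1);
}
```

If the else belonged to the outer if, the first call would return 2 instead of 0. Some compilers warn about code written this way; adding braces removes both the warning and the doubt.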


iteration_statement:

    while(expression) statement

    | do statement while(expression) ;

    | for([expression] ; [expression] ; [expression]) statement


jump_statement:

    goto identifier ;

    | continue ;

    | break ;

    | return [expression] ;


4. External Definitions


translation_unit:

    external_declaration

    | translation_unit external_declaration


external_declaration:

    function_definition

    | declaration


function_definition:

    [declaration_specifier] declarator [declaration_list] compound_statement


5. Preprocessing Directives


preprocessing_file:

    [group]


group:

    group_part

    | group group_part


group_part:

    [pp_tokens] new_line

    | if_section

    | control_line


if_section:

    if_group [elif_groups] [else_group] endif_line


if_group:

    # if constant_expression new_line [group]

    | # ifdef identifier new_line [group]

    | # ifndef identifier new_line [group]


elif_groups:

    elif_group

    | elif_groups elif_group


elif_group:

    # elif constant_expression new_line [group]


else_group:

    # else new_line [group]


endif_line:

    # endif new_line


control_line:

    # include pp_tokens new_line

    | # define identifier replacement_list new_line

    | # define identifier lparen [identifier_list] ) replacement_list new_line

    | # undef identifier new_line

    | # line pp_tokens new_line

    | # error [pp_tokens] new_line

    | # pragma [pp_tokens] new_line

    | # new_line


lparen:

    a left parenthesis ( that is not preceded by white space
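
The lparen rule is what separates the two forms of # define: white space between the macro name and the parenthesis changes the meaning completely. A short sketch, with macro and function names invented for illustration:

```c
#include <assert.h>

/* '(' directly after the name: a function-like macro with parameter x. */
#define TWICE(x) ((x) + (x))

/* White space before '(': an object-like macro whose replacement list
   is the token sequence "(x) + 1". */
#define SPACED (x) + 1

int use_twice(int x)  { return TWICE(x); }   /* expands to ((x) + (x)) */
int use_spaced(int x) { return SPACED; }     /* expands to (x) + 1 */

void check_lparen(void)
{
    assert(use_twice(4) == 8);
    assert(use_spaced(4) == 5);
}
```

In `use_spaced` the `x` in the expansion is simply whatever `x` is in scope at the point of use, because `SPACED` has no parameters at all.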


replacement_list:

    [pp_tokens]


pp_tokens:

    preprocessing_token

    | pp_tokens preprocessing_token


new_line:

    the new-line character \n

