77 Programming Language Q&A (P4)

Programming Language Concepts

The following article deals with questions/answers you may encounter when asked about Lexical and Syntax Analysis. Check the bottom of the page for links to the other questions and answers I’ve come up with to make you a great Computer Scientist (when it comes to Programming Languages).

324. What are the 3 approaches to implementing programming languages?

– Compilation, Pure Interpretation and Hybrid Implementation

325. What is the job of the syntax analyzer?

– Check the syntax of the program and create a parse tree.

326. What are the syntax analyzers based on?

– Formal description of the syntax of programs, usually BNF.

327. What are some of the advantages of using BNF?

– Descriptions are clear and concise.
– Syntax analyzers can be generated directly from BNF.
– Implementations based on BNF are easy to maintain.

328. What are the 2 distinct parts of syntax analysis and what do they do?

– Lexical analysis: deals with small-scale language constructs such as name
– Syntax analyzer: deals with large-scale constructs such as expressions

329. What are the 3 reasons for why lexical analysis is separated from syntax analysis?

– Simplicity, Efficiency and Portability

330. What does the lexical analyzer do?

– Collects input characters into groups (lexemes) and assigns an internal code (token) to each group.

331. How are lexemes recognized?

– By matching the input against patterns.

332. What are some other tasks that are performed by the lexical analyzer?

– Skipping comments and white spaces between lexemes.
– Inserting lexemes for user-defined names into the symbol table.
– Detecting syntactic errors in tokens.

333. Name the three approaches that can be used to build a lexical analyzer.

– Write a formal description of the token patterns of the language and use a software tool to automatically generate a lexical analyzer.
– Design a state transition diagram that describes the token patterns of the language and write a program that implements the diagram.
– Design a state transition diagram that describes the token patterns of the language and hand-construct a table-driven implementation of the state diagram.

334. A state transition diagram, or _________ diagram, is a directed graph.

– state

335. A state transition diagram, or state diagram, is a _______ graph.

– directed

336. Describe a state diagram.

– Nodes are labeled with state names
– The arcs are labeled with input characters
– An arc may also include actions to be done when transition is taken.

337. Syntax analysis is often referred to as _________.

– parsing

338. What are the responsibilities of a syntax analyzer?

– Determine if input program is syntactically correct.
– Produce a parse tree.

339. What must a parser do when an error is found?

– Produce a diagnostic message and recover.

340. Why is recovery used?

– So that the compiler finds as many errors as possible as quickly as possible.

341. How are parsers categorized?

– According to the direction in which they build parse trees.

342. What are the two different parser categories?

– Top-down
– Bottom-up

343. Describe a top-down parser.

– Builds a tree from thee root down to the leaves

344. Describe a bottom-up parser.

– Builds a tree from the leaves up to the root.

345. A top-down parser traces or builds the parse tree in _________.

– preorder

346. What does it mean when a top-down parser traces the parse tree in preorder?

– Each node is visited before its branches are followed.

347. The actions taken by at top-down parser correspond to a _________ derivation.

– Leftmost

348. A _ parser is coded directly from the BNF description of the syntax language.

– recursive descent

349. What is an alternative to using a recursive-descent parser?

– Using a parsing table

350. Both parsing tables and recursive-descent parsers are _ ____.

– LL algorithms.

351. What does LL stand for in LL algorithm?

– First L: specifies a left-to-right scan of the input
– Second L: specifies that a leftmost derivation is generated

352. A given right sentential form may include more than one RHS from the grammar. The correct RHS to reduce is called the _________.

– handle

353. The most common bottom-up parsing algorithms are in the ___ family.

– LR

354. What does LR stand for in the LR algorithm?

– L: left-to-right scan
– R: rightmost derivation is generated

355. What is the Big O for parsing algorithms that work for any grammar?

– O(n³)

356. What is the Big O for parsing algorithms that are used in commercial applications?

– O(n)

357. How does a recursive-descent parser produce a parse tree?

– In top-down order

358. A recursive-descent parser has one subprogram for each ___________ in the grammar.

– nonterminal

359. _____ is ideally suited for describing recursive-descent parsers.

– EBNF

360. ____ causes a catastrophic problem for LL parsers.

– Left recursion

361. The left recursion in the rule A → A + B is called ___ _ _____.

– direct left recursion

362. True or False? Direct Left Recursion occurs in one rule.

– True

363. What does the ε symbol represent?

– empty string

364. A rule that has ε as its RHS is called an .

– erasure rule

365. Why is the erasure rule called that?

– Because using it in a derivation effectively erases its LHS from the sentential form.

366. The _ _ ______ is used to test a non-left-recursive grammar to determine whether it can be parsed in a top-down fashion.

– pairwise disjointness test

367. What does the pairwise disjointness test compute?

– FIRST sets

368. Is left-recursion acceptable for bottom-up parsers?

– Yes

369. A bottom-up parser produces the _____ of a rightmost derivation by starting with the last sentential form (input sentence) and working back to the start symbol.

– reverse

370. In each step of the bottom-up parser, the parser’s task is to find the ____ in the current sentential form that must be rewritten to get the previous sentential form.

– RHS

371. True or False? A right sentential form may include more than one RHS.

– True. i.e. The right sentential form E + T * id includes three RHSs, E + T, T, and id.

372. What is the task of a bottom-up parser for a given right sentential form?

– to find the unique handle

373. What is a phrase?

– A string consisting of all the leaves of the partial parse tree that is rooted at one particular internal node of the whole parse tree.

374. What is a simple phrase?

– A phrase that is derived from a nonterminal in a single step.

375. The handle of a right sentential form is the phrase.

– leftmost simple

376. Bottom up parsers are often called __ algorithms.

– shift reduce

377. Why are bottom up parsers often called shift-reduce algorithms?

– Because shift and reduce are their two fundamental actions

378. What does the shift action do in a shift-reduce algorithm?

– Moves the next input token onto the parser’s stack.

379. What does the reduce action do in a shift-reduce algorithm?

– Replaces the RHS (the handle) on top of the parser’s stack by its corresponding LHS.

380. Most bottom-up parsers belong to the ___ family.

– LR

381. What do LR parsers use?

– A small amount of code and a parsing table

382. What was the name of the original LR algorithm that was produced by Donald Knuth?

– canonical LR algorithm

383. Why was the canonical LR algorithm not widely used?

– It required large amounts of computer time and memory to produce a parsing table

384. What are some of the advantages of LR parsers?

– Can be built for all programming languages
– Can detect syntax errors as soon as possible
– LR class of grammars is a proper superset of the class parsable by LL parsers.

385. True or False? It’s easy to produce an LR parsing table by hand.

– False

386. How are LR parsing tables normally produced?

– Using software that takes a grammar as input and produces the parsing table

387. True or False? The LR parser can avoid examining the entire stack if it keeps a summary of the stack contents in a “state” symbol on top of the stack.

– True

388. What does the dollar sign represent in the LR parser configuration?

– End-of-input marker

389. The LR parsing process is based on the parsing table, which has two parts, _ and _.

– ACTION and GOTO

390. What does the parse table specify?

– What the parser should do

391. How does the parse table specify what the parser should do?

– Based on the state symbol on top of the parse stack and the next input symbol

392. Besides shift and reduce, what are the two other actions possible in a parse table and what do they do?

– Accept (parsing is complete) and error (a syntax error has been detected)

393. What do the values in the GOTO part indicate?

– Which state symbol should be pushed onto the parse stack after a reduction has been completed.

394. Describe the shift parser action.

– The next input symbol is pushed onto the stack along with the state symbol specified in the ACTION table.

395. Describe the reduce parser action.

– First, the handle is removed from the stack.
– Next, the number of symbols removed is twice the number of symbols in the handle.
– Next, the LHS of the rule is pushed onto the stack.
– Finally, the GOTO table is used to determine which state must be pushed onto the stack.

396. Describe the accept parser action.

– The parse is complete and no errors were found

397. Describe the error parser action.

– The parser calls an error-handling routine.

398. In imperative programming languages, what is a variable?

– A variable is an abstraction of the von Neumann architecture’s memory cells.

399. How can a variable be characterized?

– By a collection of attributes: Name, Address, Value, Type, Scope and Lifetime

400. True or False? In functional programming languages, once a name is created it can be changed.