Structure
Modules¶
A module is defined by a single flowspec file. A module can contain several sections, for defining control-flow, data flow, types, and functions. Modules can import other modules.
module $module-id
imports
$module-ref*
$section*
Terms and Patterns¶
FlowSpec defines various data types, including terms, tuples, sets, and maps. These can be constructed by the user, or introduced by matching on the AST.
term = ctor-id "(" {term ","}* ")"
| "(" {term ","}* ")"
| "{" {term ","}* "}"
| "{" term "|" {term ","}* "}"
| "{" {(term "|->" term) ","}* "}"
| "{" term "|->" term "|" {(term "|->" term) ","}* "}"
Examples of these terms can be found in the Expressions subsection.
Control flow and data flow rules can use patterns to define which rules apply to which AST nodes.
pattern = ctor-id "(" {pattern ","}* ")"
| "(" {pattern ","}* ")"
| var-id "@" pattern
| "_"
| var-id
Example. The following shows an example of a pattern, matching a VarDec
constructor with an Int
child, binding some of the subterms.
VarDec(n, _, num@Int(i))
Control Flow¶
The control-flow section contains the rules that define the control-flow for the subject language.
control-flow rules
$control-flow-rule*
Control Flow Rules¶
A control-flow rule consists of a pattern and a corresponding list of control-flow chains.
control-flow-rule = "root"? pattern "=" {cfg-chain ","}+
| "node" pattern
cfg-chain = {cfg-chain-elem "->"}+
cfg-chain-elem = "entry"
| "exit"
| variable
| "node" variable
| "this"
Example. Module that specifies how the control-flow for the Add
AST node goes from the lhs, the rhs, and then to the Add
itself. It also specifies that Int
must have a node in the control-flow graph.
module control
control-flow rules
node Int(_)
Add(l, r) = entry -> l -> r -> this -> exit
Root Rules¶
A root of the control-flow defines the start and end nodes of a control-flow graph. You can have multiple control-flow graphs in the same AST, but not nested ones. Each control-flow graph has a unique start and end node. A root control-flow rule introduces the start and end node. In other control-flow rules these nodes can be referred to for abrupt termination.
cfg-chain-elem = ...
| "start"
| "end"
Example. Module that defines control-flow for a procedure, and the return statement that goes straight to the end of the procedure.
module control
control-flow rules
root Procedure(args, _, body) = start -> args -> body -> end
Return(_) = entry -> this -> end
Data Flow¶
Properties¶
The data flow section contains definitions of the properties to compute, and the rules that define how these properties are computed.
properties
property-definition*
A property has a name, and a corresponding lattice type. The result after analysis will be a lattice of this type for each node in the control-flow graph.
property-definition = name ":" lattice
Example. Lattice definition for a constant-value analysis.
properties
values: Map[name, Value]
Rules¶
The data flow rules specify how data should flow across the control-flow graph.
property rules
property-rule*
property-rule = name "(" prop-pattern ")" "=" expr
prop-pattern = name "->" pattern
| pattern "->" name
| pattern "." "start"
| pattern "." "end"
Example. A simple specification for a constant-value analysis.
property rules
values(_.end) = Map[string, Value].bottom
values(prev -> VarDec(n, _, Int(i))) = { k |-> v | (k |-> v) <- values(prev), k != n } \/ {n |-> Const(i)}
values(prev -> VarDec(n, _, _)) = { k |-> v | (k |-> v) <- values(prev), k != n } \/ {n |-> Top()}
values(prev -> _) = values(prev)
Types¶
Algebraic datatypes can be defined for use within lattices definitions. Users can directly match these datatypes, or construct new values.
types
type-definition*
An algebraic datatype consists of a constructor and zero or more arguments.
name =
("|" ctor-id "(" {type ","}* ")")+
Example. The definition for an algebraic type ConstProp
used in constant value analysis.
types
ConstProp =
| Top()
| Const(int)
| Bottom()
Lattices¶
Lattices are the main data type used in data-flow analysis, because of their desirable properties. Properties (the analysis results) must always be of type lattice. FlowSpec contains some builtin lattice types, but users can also specify their own.
lattices
lattice-definition*
Lattice definitions must include the following: the underlying datatype, a join operator (either least-upper bound or greatest-lower bound), a top, and a bottom.
name where
type = type
lub([name], name) = expr
top = expr
bottom = expr
Example. A lattice definition using the ConstProp
above to define a Value
type.
lattices
Value where
type = ConstProp
bottom = Bottom()
top = Top()
lub(l, r) = match (l, r) with
| (Top(), _) => Top()
| (_, Top()) => Top()
| (Const(i), Const(j)) => if i == j then Const(i) else Top()
| (_, Bottom()) => l
| (Bottom(), _) => r
Functions¶
Functions make it possible to reuse functionality and avoid duplication of logic.
functions
function-definition*
name([{(name ":" type) ","}+]) =
expr
Expressions¶
Integers¶
Integer literals are written with an optional minus sign followed by one or more decimals.
Supported integer operations are:
- Addition [
+
] - Subtraction [
-
] - Multiplication [
*
] - Division [
/
] - Modulo [
%
] - Negate [
-
] - Comparison [
<
,<=
,>
,>=
,==
,!=
]
Booleans¶
Boolean literals true and false are available as well as the usual boolean operations:
- And [
&&
] - Or [
||
] - Not [
!
]
Sets and Maps¶
Set and map literals are both denoted with curly braces. A set literal contains a comma-separated list of elements: {elem1, elem2, elem3}
. A map literal contains a comma-separated list of bindings of the form key |-> value: { key1 |-> value1, key2 |-> value2 }
.
Operations on sets and maps include
- Union [
\/
] - Intersection [
/\
] - Set/map minus [
\
] - Containment/lookup [
in
]
There are also comprehensions of the form { new | old <- set, conditions }
or { newkey |-> newvalue | oldkey |-> oldvalue <- map, condition }
, where new elements or bindings are gathered based on old ones from a set or map, as long as the boolean condition expressions hold. Such a condition expression may also be a match expression without a body for the arms. This is commonly used to filter maps or sets.
Example. The following are some examples of sets and maps.
// A map comprehension filtering the key n
{ k |-> v | (k |-> v) <- values(prev), k != n }
// A map literal
{n |-> Top()}
// A set comprehension filtering the value n
{ k | k <- live(prev), k != n }
// A set literal
{ n, "b", "foo" }
Match¶
Pattern matching can be done with a match expression: match expr with | pattern1 => expr2 | pattern2 => expr2
, where expr are expressions and pattern are patterns. Terms and patterns are defined at the start of the reference.
Variables and References¶
Pattern matching can introduce variables. Other references include values in the lattice, such as MaySet.bottom
or MustSet.top
.
Functions and Lattice Operations¶
User defined functions are invoked with functionname(arg1, arg2)
. Lattice operations can be similarly invoked, requiring the type name: MaySet.lub(s1, s2)
.
Property Lookup¶
Property lookup is similar to a function call, although property lookup only ever has a single argument.
Example. The following property rule performs a set comprehension over the results of a property lookup, live(prev)
, where the property live
has been declared in the properties
section, and next
is bound in the pattern.
live(VarDec(n, _, _) -> next) = { k | k <- live(next), k != n }
Term Positions¶
FlowSpec provides a builtin function that returns the position of a term: position(term)
. This can be used to differentiate two terms from an AST that are otherwise equal.
Lexical Grammar¶
Identifiers¶
Most identifiers in FlowSpec fall into one of two categories, which we will refer to as:
Lowercase identifiers, that start with a lowercase character, and must match the regular expression [a-z][a-zA-Z0-9]*.
Uppercase identifiers, that start with an uppercase character, and must match the regular expression [A-Z][a-zA-Z0-9]*.
Comments¶
Comments in FlowSpec follow C-style comments:
// ... single line ... for single-line comments
/* ... multiple lines ... */ for multi-line comments
Multi-line comments can be nested, and run until the end of the file when the closing */ is omitted.
Created: October 17, 2024