Adamátor
Zápisky
Hlášky
Finite Transducers
Lectures given by
Prof. Jacques Sakarovitch
.
Convention
We will denote an alphabet with
A
, the free monoid (the set of words) over
A
by
A
*
and the empty word by
1
A
*
.
Definition
A
finite automaton
on
A
is a directed graph with edges labelled by the letters from
A
. It is described by the 5-tuple
𝒜
=
⟨
A
,
Q
,
I
,
E
,
T
⟩
, where
Q
is the set of
states
(vertices),
I
⊂
Q
is the set of
initial states
,
E
⊂
Q
×
A
×
Q
is the set of
transitions
and
T
⊂
Q
is the set of
final states
. The
accepted language
is the set
L
(
𝒜
)
≔
{
w
∈
A
*
|
∃
i
∈
I
,
t
∈
T
:
i
→
𝒜
w
t
}
.
Note
Let
A
*
,
B
*
be two free monoids. Then
A
*
×
B
*
is also a monoid, but not free (for example, given
a
∈
A
and
b
∈
B
, the elements
(
a
,
1
A
*
)
and
(
1
B
*
,
b
)
commute).
Definition
A
finite transducer
on
A
is a directed graph with edges labelled by pairs from
A
*
×
B
*
. It is described by the 5-tuple
𝒯
=
⟨
A
*
×
B
*
,
Q
,
I
,
E
,
T
⟩
, where
Q
is the set of
states
(vertices),
I
⊂
Q
is the set of
initial states
,
E
⊂
Q
×
A
*
×
B
*
×
Q
is the set of
transitions
and
T
⊂
Q
is the set of
final states
. We define
|
𝒯
|
≔
{
(
u
,
v
)
|
∃
i
∈
I
,
t
∈
T
:
i
→
𝒯
(
u
,
v
)
T
}
.
Definition
A
relation
θ
from
A
*
to
B
*
is defined by its graph
θ
^
⊂
A
*
×
B
*
and denoted by
θ
:
A
*
→
B
*
, with
θ
(
u
)
≔
{
v
∈
B
*
|
(
u
,
v
)
∈
θ
^
}
.
Its
inverse
is
θ
−
1
:
B
*
→
A
*
with the same graph and
θ
−
1
(
v
)
≔
{
u
∈
A
*
|
(
u
,
v
)
∈
θ
^
}
.
Definition
additivity
Let
L
⊂
A
*
. Then
θ
(
L
)
≔
⋃
u
∈
L
θ
(
u
)
.
Definition
The
complement
of a relation
θ
is the relation
∁
θ
such that
∁
θ
^
=
∁
θ
^
=
(
A
*
×
B
*
)
∖
θ
^
.
Definition
Let
𝒯
be a finite transducer. Then
|
𝒯
|
is its
realised relation
.
Note
The domain and range of
|
𝒯
|
are regular languages and the relation
|
𝒯
|
−
1
is also realised by a transducer (with the labels swapped).
Note
C
≔
(
A
×
{
1
}
)
∪
(
{
1
}
×
B
)
is a generating set of the monoid
A
*
×
B
*
. So is
D
≔
(
A
×
{
1
}
)
∪
(
{
1
}
×
B
)
∪
(
A
×
B
)
.
Definition
A transducer is
normalised
if its labels are in
C
and
subnormalised
if its labels are in
D
.
Theorem
Every transducer is equivalent to a normalised transducer.
Proof
We replace the edge
(
u
,
v
)
with the sequence of edges
(
u
1
,
1
)
,
…
,
(
u
|
u
|
,
1
)
,
(
1
,
v
1
)
,
…
,
(
1
,
v
|
v
|
)
.
Definition
An automaton over a monoid
M
is
proper
if no edge is labelled by
1
M
(
spontaneous transition
).
Theorem
Every automaton is equivalent to a proper automaton.
Proof
Shown in
the main lecture
. It uses the
backward closure
and the
forward closure
.
Definition
Let
K
⊂
A
*
be a regular language. We define the relation
z
K
:
A
*
→
A
*
,
z
K
(
u
)
≔
{
{
u
}
if
u
∈
K
,
∅
otherwise.
Note
Obviously this relation is accepted by a finite transducer.
Note
We can extend the notion of a finite transducer to arbitrarily many free monoids.
Note
We can also have
right automata
, which read from right to left. We can model a right automaton as the transpose of a left automaton. However, this does not preserve certain properties (such as determinism), so it is often useful to treat them as a separate concept.
Definition
The
composition
of relations
θ
:
A
*
→
B
*
and
σ
:
B
*
→
C
*
is the relation
σ
∘
θ
:
A
*
→
C
*
defined as
(
σ
∘
θ
)
(
u
)
≔
σ
(
θ
(
u
)
)
, or equivalently
σ
∘
θ
^
≔
{
(
u
,
w
)
∈
A
*
×
C
*
|
∃
v
∈
B
*
:
(
u
,
v
)
∈
θ
^
∧
(
v
,
w
)
∈
σ
^
}
.
Definition
The
composition product
of two subnormalised transducers
𝒯
=
⟨
A
*
×
B
*
,
Q
,
I
,
E
,
T
⟩
,
𝒮
=
⟨
B
*
×
C
*
,
R
,
J
,
F
,
U
⟩
is the transducer
𝒯
⋈
𝒮
=
⟨
A
*
×
C
*
,
Q
×
R
,
I
×
J
,
G
,
T
×
U
⟩
with
G
≔
G
1
∪
G
2
∪
G
3
,
G
1
≔
{
(
p
,
r
)
→
(
x
,
y
)
(
q
,
s
)
|
x
∈
A
∪
{
1
}
,
y
∈
C
∪
{
1
}
,
p
,
q
∈
Q
,
r
,
s
∈
R
|
∃
b
∈
B
:
p
→
(
a
,
b
)
q
∈
E
∧
r
→
(
b
,
y
)
s
∈
F
}
,
G
2
≔
{
(
p
,
r
)
→
(
a
,
1
)
(
q
,
r
)
|
a
∈
A
,
p
,
q
∈
Q
,
r
∈
R
|
p
→
(
a
,
1
)
q
∈
E
}
,
G
3
≔
{
(
p
,
r
)
→
(
1
,
c
)
(
p
,
s
)
|
c
∈
C
,
p
∈
Q
,
r
,
s
∈
R
|
r
→
(
1
,
c
)
s
∈
E
}
.
Note
The composition product might result in spontaneous transitions, which can be eliminated with backward closure.
Theorem
Elgot–Megei
The relation realised by the composition product of two transducers is the composition of their relations.
Note
Let
M
be a monoid. Then
𝒫
(
M
)
with the operations
∪
,
⋅
is a semiring.
Definition
Let
M
be a monoid and
P
⊂
M
. Then we define
P
0
≔
{
1
M
}
,
P
n
+
1
≔
P
n
⋅
P
,
P
*
≔
⋃
n
∈
ℕ
0
P
n
.
Lemma
Let
M
be a monoid and
P
⊂
M
. Then
P
*
=
⟨
P
⟩
.
Definition
A
rational expression
over a monoid
M
is an expression built inductively as:
0
,
1
and
m
∈
M
are atomic expressions,
if
E
,
F
are expressions, then
E
+
F
,
E
⋅
F
and
E
*
are expressions.
The subset
denoted
by a rational expression is
|
0
|
=
∅
,
|
1
|
=
{
1
M
}
,
|
m
|
=
{
m
}
,
|
E
+
F
|
=
|
E
|
∪
|
F
|
,
|
E
⋅
F
|
=
|
E
|
|
F
|
,
|
E
*
|
=
|
E
|
*
.
A subset
P
⊂
M
is
rational
if it is denoted by a rational expression. The set of all rational subsets is denoted
Rat
M
.
Note
The name “rational” comes from the fact that
X
*
=
1
+
X
X
*
, which could be rewritten as
X
*
(
1
−
X
)
=
(
1
−
X
)
X
*
=
1
.
So although
1
−
X
does not actually mean anything, we can imagine it as being the multiplicative inverse of
X
*
. This is analogous to the geometric series identity
1
1
−
x
=
∑
n
=
0
∞
x
n
.
Theorem
A subset of a monoid
M
is rational if and only if it is accepted by a finite automaton over
M
.
Proof
Analogous to the one for regular languages.
(⇒)
Operations on standard automata.
(⇐)
State elimination method.
Note
Unlike regular languages, rational sets are not closed under intersection. For example, let
|
𝒱
1
|
≔
{
(
𝖺
n
𝖻
m
,
𝖼
n
)
|
n
,
m
∈
ℕ
}
,
|
𝒱
2
|
≔
{
(
𝖺
n
𝖻
m
,
𝖼
m
)
|
n
,
m
∈
ℕ
}
.
Both of these can be recognised by a finite transducer, but their intersection is
|
𝒱
1
|
∩
|
𝒱
2
|
=
{
(
𝖺
n
𝖻
n
,
𝖼
n
)
|
n
∈
ℕ
}
,
which is clearly not rational.
Corollary
Rational sets are not closed under complement.
Theorem
The complement if the identity is a rational expression.
Corollary
Any function
θ
is a rational expression.
Proof
∁
θ
^
=
∁
Dom
θ
×
B
*
∪
χ
B
*
∘
θ
^
.
Theorem
Rabin & Scott, 1959
Given
R
,
S
∈
Rat
A
*
×
B
*
with
|
A
|
,
|
B
|
≥
2
, it is undecidable whether
R
∩
S
=
∅
.
Theorem
Fischer & Rosenberg, 1968
If
|
A
|
,
|
B
|
≥
2
, it is undecidable whether two transducers over
A
*
×
B
*
are equivalent.
Theorem
Gibbons & Ryther, 1986
Given
R
,
S
∈
Rat
{
𝖺
,
𝖻
}
*
×
{
𝖼
}
, it is decidable whether
R
∩
S
=
∅
.
Theorem
Ibarra, 1978; Liskovik, 1979
It is undecidable whether two transducers over
{
𝖺
,
𝖻
}
*
×
{
𝖼
}
are equivalent.
Problem
Post correspondence problem
Let
B
be a set with at least
2
elements and
U
=
{
u
1
,
…
,
u
k
}
,
V
=
{
v
1
,
…
,
v
k
}
⊂
B
*
. Does there exist a sequence of indices
i
1
,
…
,
i
p
such that
u
i
1
⋯
u
i
p
=
v
i
1
⋯
v
i
p
?
Theorem
The Post correspondence problem is recursively undecidable.
Note
Take the sets
U
,
V
from the Post correspondence problem and define morphisms
τ
U
,
τ
V
:
[
k
]
*
→
B
*
as
τ
U
(
i
)
≔
u
i
,
τ
V
(
i
)
≔
v
i
. Then the Post correspondence problem can be reformulated as deciding whether
∃
w
∈
[
k
]
*
:
τ
U
(
w
)
=
τ
V
(
w
)
, or equivalently,
τ
^
U
∩
τ
^
V
≠
∅
.
Note
Let
A
=
{
𝖺
,
𝖻
}
. Define the injective morphism
κ
:
[
k
]
*
→
A
*
,
κ
(
i
)
≔
𝖺
i
𝖻
. Then
τ
U
∘
κ
−
1
^
∩
τ
V
∘
κ
−
1
^
=
∅
⟺
τ
^
U
∩
τ
^
V
=
∅
.
This somehow implies that the equivalence of rational sets is undecidable.
Theorem
Let
R
⊂
Rat
A
*
×
B
*
with
|
A
|
,
|
B
|
≥
2
. Then it is undecidable whether
R
=
A
*
×
B
*
.
Proof
∁
(
τ
U
∘
κ
−
1
)
^
∩
∁
(
τ
V
∘
κ
−
1
)
^
=
A
*
×
B
*
⟺
τ
^
U
∩
τ
^
V
=
∅
.
Note
Let
A
∩
B
=
∅
and
Z
≔
A
∪
B
. Define the projections
π
A
:
Z
*
→
A
*
,
π
B
:
Z
*
→
B
*
the obvious way. Then for a given
θ
:
A
*
→
B
*
we have
θ
∈
Rat
A
×
×
B
*
⟺
∃
K
∈
Rat
Z
*
:
θ
=
π
B
∘
Z
K
∘
π
A
−
1
.
Definition
Let
𝒜
=
⟨
Q
,
I
,
E
,
T
⟩
,
ℬ
=
⟨
R
,
J
,
F
,
U
⟩
be two finite automata. A
graph-morphism
is a function
φ
:
Q
→
R
such that
φ
(
I
)
⊂
J
,
φ
(
T
)
⊂
U
,
∀
(
p
,
a
,
q
)
∈
E
:
(
φ
(
p
)
,
a
,
φ
(
q
)
)
∈
F
.
We denote
φ
:
𝒜
↬
ℬ
.
Theorem
If
φ
:
𝒜
↬
ℬ
, then
|
𝒜
|
⊂
|
ℬ
|
.
Theorem
The projection from
𝒜
×
ℬ
to
𝒜
is a graph-morphism.
Theorem
If
φ
:
𝒜
↬
ℬ
, then also
φ
:
𝒜
𝖳
↬
ℬ
𝖳
.
Definition
A graph-morphism
φ
:
𝒜
↬
ℬ
is
conformal
if every computation in
ℬ
is the image of a computation in
𝒜
.
Definition
Given an automaton
𝒜
, denote by
𝒜
n
the normalised version of
𝒜
with two new
subliminal states
i
,
t
. A graph-morphism
φ
:
𝒜
↬
ℬ
can be naturally extended to a morphism
φ
n
:
𝒜
n
↬
ℬ
n
.
Definition
The
outgoing bouquet
of a state
p
in an automaton
𝒜
is
Out
𝒜
(
p
)
≔
{
e
∈
E
|
e
=
(
p
,
a
,
q
)
}
.
The
ingoing bouquet
of a state
q
in an automaton
𝒜
is
In
𝒜
(
q
)
≔
{
e
∈
E
|
e
=
(
p
,
a
,
q
)
}
.
Definition
A graph-morphism
φ
:
𝒜
↬
ℬ
is
locally outsurjective
/
outbijective
if for all
p
∈
𝒜
n
, the morphism
φ
n
:
Out
𝒜
n
(
p
)
→
Out
ℬ
n
(
φ
(
p
)
)
is surjective/bijective.
Definition
Let
𝒜
=
⟨
Q
,
I
,
E
,
T
⟩
,
ℬ
=
⟨
R
,
J
,
F
,
U
⟩
be two finite automata. A
morphism
from
𝒜
to
ℬ
is a graph-morphism
φ
:
𝒜
↬
ℬ
which is surjective and locally outsurjective. If there exists a morphism from
𝒜
to
ℬ
,
ℬ
is a
quotient
of
𝒜
.
Theorem
If
ℬ
is a quotient of
𝒜
, then
|
𝒜
|
=
|
ℬ
|
.
Definition
A graph-morphism
φ
:
𝒜
↬
ℬ
is
locally insurjective
/
inbijective
if for all
p
∈
𝒜
n
, the morphism
φ
n
:
In
𝒜
n
(
p
)
→
In
ℬ
n
(
φ
(
p
)
)
is surjective/bijective.
Definition
Let
𝒜
=
⟨
Q
,
I
,
E
,
T
⟩
,
ℬ
=
⟨
R
,
J
,
F
,
U
⟩
be two finite automata. A
comorphism
from
𝒜
to
ℬ
is a graph-morphism
φ
:
𝒜
↬
ℬ
which is surjective and locally insurjective. If there exists a comorphism from
𝒜
to
ℬ
,
ℬ
is a
coquotient
of
𝒜
.
Definition
Let
𝒜
=
⟨
Q
,
I
,
E
,
T
⟩
,
ℬ
=
⟨
R
,
J
,
F
,
U
⟩
be two finite automata. A
covering
from
𝒜
to
ℬ
is a graph-morphism
φ
:
𝒜
↬
ℬ
which is surjective and locally outbijective.
Definition
Let
𝒜
=
⟨
Q
,
I
,
E
,
T
⟩
,
ℬ
=
⟨
R
,
J
,
F
,
U
⟩
be two finite automata. A
cocovering
from
𝒜
to
ℬ
is a graph-morphism
φ
:
𝒜
↬
ℬ
which is surjective and locally inbijective.
Definition
An automaton
𝒜
is
unambiguous
if every accepted word has only one computation.
Theorem
If
φ
:
𝒜
→
ℬ
is a morphism, then there exists a subautomaton
𝒞
of
𝒜
such that
φ
:
𝒞
→
ℬ
is a covering.
Definition
A state
q
in a subnormalised transducer
𝒯
is
stalled
/
tape-consistent
if, if an ingoing transition is labeled in
{
1
}
×
A
(resp.
A
×
{
1
}
), then all outgoing transitions are labeled in
{
1
}
×
A
(resp.
A
×
{
1
}
).
Definition
A subnormalised transducer is
synchronous
if every state is stalled. A relation
θ
:
A
*
→
B
*
is
synchronous
if it is realised by a synchronous transducer. The set of synchronous relations is denoted
Syn
A
*
×
B
*
.
Note
This corresponds to a Turing machine with two tapes where both tape heads only move forward and simultaneously.
Theorem
The inverse of a synchronous relation is a synchronous relation.
Theorem
The universal relation is synchronous.
Theorem
Syn
A
*
×
B
*
is an effective Boolean algebra.
Theorem
A synchronous transducer can be determinised.
Definition
A transducer
𝒯
is
homogeneous
if for every state
p
, either:
all ingoing transitions of
p
are in
A
′
≔
A
*
×
{
1
}
(
p
is of
type α
), or
all ingoing transitions of
p
are in
B
′
≔
{
1
}
×
B
*
(
p
is of
type β
), or
all ingoing transitions of
p
are in
C
≔
A
*
×
B
*
(
p
is of
type γ
).
Theorem
Every synchronous transducer has a homogeneous covering.
Definition
A transducer
𝒯
is
complete
if:
𝒯
is homogeneous,
for every state
p
of type
γ
and every
d
∈
D
≔
A
′
∪
B
′
∪
C
there exists at least one transition out of
p
labeled by
D
,
for every state
p
of type
α
(resp.
β
) and every
x
∈
A
′
(resp.
B
′
) there exists at least one transition out of
p
labeled by
x
,
Theorem
Every homogenous synchronous transducer can be made complete.
Theorem
It is recursively undecidable if a rational relation is synchronous.
Definition
The
matrix representation
of an automaton
𝒜
=
⟨
Q
,
I
,
E
,
T
⟩
is the morphism
μ
:
A
*
→
𝔹
Q
×
Q
defined as
μ
(
a
)
p
,
q
≔
[
p
→
a
q
]
.
We will also represent
I
as a row vector and
T
as a column vector. Then we can write
𝒜
=
⟨
I
,
μ
,
T
⟩
.
Lemma
For every
w
∈
A
*
and
p
,
q
,
μ
(
w
)
=
1
⟺
p
→
𝒜
w
q
.
Corollary
L
(
𝒜
)
=
{
w
∈
𝒜
*
|
I
⋅
μ
(
w
)
⋅
T
=
1
}
.
Definition
A language
L
⊂
A
*
is
recognisable
if there exists a finite monoid
M
, a morphism
α
:
A
*
→
M
and
P
⊂
M
such that
L
=
α
−
1
(
P
)
.
Theorem
A language is recognisable if and only if it is accepted by a finite automaton.
Proof
(⇐)
Let
𝒜
=
⟨
I
,
μ
,
T
⟩
be an automaton accepting
L
, represented as a matrix. Take
P
≔
{
m
∈
𝔹
Q
×
Q
|
I
⋅
m
⋅
T
=
1
}
.
Then
L
=
μ
−
1
(
P
)
.
(⇒)
Let
𝒜
≔
M
,
{
1
M
}
,
E
,
P
with transitions
E
≔
{
m
→
a
m
⋅
α
(
a
)
|
m
∈
M
,
a
∈
A
}
.
Note that this automaton is deterministic and for all
w
∈
𝒜
*
,
1
M
→
𝒜
w
α
(
w
)
.
Note
This construction could in theory be used to determinise an automaton, but it’s even worse than the subset construction because it may produce up to
2
|
Q
|
2
states.
Note
An automaton defined on a non-free monoid does not necessarily have a matrix representation because not every function defined on a generating set can be extended to a morphism.
Definition
A
finite real-time transducer
is a tuple
𝒯
=
⟨
A
*
×
B
*
,
Q
,
I
,
E
,
T
⟩
, where
Q
is a finite set,
E
⊂
Q
×
(
A
×
𝒫
(
B
*
)
)
×
Q
and
I
,
T
:
Q
→
𝒫
(
B
*
)
. Note that the subsets of
B
*
may be infinite!
Theorem
A relation
θ
:
A
*
→
B
*
is rational if and only if it is realised by a finite real-time transducer.
Theorem
Let
E
∈
𝒫
(
A
)
Q
×
Q
. Then the entries of
E
*
belong to the rational closure of the entries of
E
.
Theorem
If
μ
:
A
*
→
(
Rat
B
*
)
Q
×
Q
is a morphism, then
μ
(
Rat
A
*
)
⊂
(
Rat
B
*
)
Q
×
Q
.