公共表达式消除 - Common Subexpression Elimination

表达式树的应用!

Posted by isheng on July 26, 2016

表达式树问题

表达式树,将表达式存储在树结构中

原题题目

Description

Let the set Σ consist of all words composed of 1-4 lower case letters, such as the words “a”, “b”, “f”, “aa”, “fun” and “kvqf”. Consider expressions according to the grammar with the two rules
E → f
E → f(E, E)
for every symbol f ∈ Σ. Any expression can easily be represented as a tree according to its syntax. For example, the expression “a(b(f(a,a),b(f(a,a),f)),f(b(f(a,a),b(f(a,a),f)),f))” is represented by the tree on the left in the following figure:
Last night you dreamt of a great invention which considerably reduces the size of the representation:
use a graph instead of a tree, to share common subexpressions. For example, the expression above can be represented by the graph on the right in the figure. While the tree contains 21 nodes, the graph just contains 7 nodes.
Since the tree on the left in the figure is also a graph, the representation using graphs is not necessarily unique. Given an expression, find a graph representing the expression with as few nodes as possible!
树图

Input

The first line of the input contains the number c (1 ≤ c ≤ 200), the number of expressions. Each of the following c lines contains an expression according to the given syntax, without any whitespace. Its tree representation contains at most 50 000 nodes.

Output

For each expression, print a single line containing a graph representation with as few nodes as possible.
The graph representation is written down as a string by replacing the appropriate subexpressions with numbers. Each number points to the root node of the subexpression which should be inserted at that position. Nodes are numbered sequentially, starting with 1; this numbering includes just the nodes of the graph (not those which have been replaced by numbers). Numbers must point to nodes written down before (no forward pointers). For our example, we obtain ‘a(b(f(a,4),b(3,f)),f(2,6))’.

Sample Input

3
this(is(a,tiny),tree)
a(b(f(a,a),b(f(a,a),f)),f(b(f(a,a),b(f(a,a),f)),f))
z(zz(zzzz(zz,z),zzzz(zz,z)),zzzz(zz(zzzz(zz,z),zzzz(zz,z)),z))

Sample Output

this(is(a,tiny),tree)
a(b(f(a,4),b(3,f)),f(2,6))
z(zz(zzzz(zz,z),3),zzzz(2,5))

我的思路

这个题目在那本紫书上有解题思路,看了一下,用到了表达式树的知识,关键在于这个题目需要不断的比较子树是否相同,相同则消去,如果按照递归便利来消除将会造成算法复杂度高达明显就是会超时,我们可以建立一个快速匹配子树的方法降低程序的复杂度。建立方法如下:
1.将每个节点编号,如题目中根节点的编号为(a,b,f);
2.按照前序遍历给子节点编序列并存到新的树中,如根节点为,它的左节点为…如果出现相同的子节点则消去此节点不将它存储到新的树中,如下图所示
建树图
此题关键在于检索已经出现过得子树,用这种简单值匹配的方法可以将算法复杂度降低到,这样就不会超时了
3.对于建立的新的树按照中序遍历的方法输出

调试过程

最重要的是给节点遍序时,最好是给节点分配一个不会重复值来替代树的遍历查询,可以用hash值来确定一个节点是否重复或者直接比对其字符串。

改进

只知道按照书上的思路解得,有其他方法可以留言给我。

AC代码

#include<cstdio>
#include <string>
#include <map>
using namespace std;
const int MAX_a = 100000;
char s[MAX_a*5],*NextStr;
int sum,n,m,kase,cnt,End[MAX_a];

struct Node {//结构存储,hash用来计算节点的唯一值,用于快速比对
    string s;
    int hash,left,right;
    bool operator < (const Node& u) const {//重载“<”号,用于比对根及其子节点节点是否相同
        if(hash != u.hash) return hash < u.hash;//如果根节点不同,则返回其真值
        if(left != u.left) return left < u.left;//左节点不同,则返回其真值
        return right < u.right;//右节点不同,则返回其比较真值
    }
}Nodes[MAX_a];
map<Node,int> dict;//建立节点于序号关联
int BuildTree() {//
    int id = cnt++;
    Node& u = Nodes[id];
    u.left = u.right = -1 ;
    u.s = "";
    u.hash = 0;
    while(isalpha(*NextStr)) {//isalpha判断是否为字母的函数
        u.hash = u.hash * 27 + *NextStr - 'a' + 1;//这一段hash计算一个节点的唯一值,这个其他地方来的。这里好像可以不用这个,用其他方法也能检验此节点是否重复。
        u.s.push_back(*NextStr);
        ++NextStr;
    }
    if(*NextStr == '(') {//如果为左括号则递归
        NextStr++; u.left = BuildTree(); NextStr++; u.right = BuildTree(); NextStr++;
    }
    if(dict.count(u)) {
        id--; cnt--;
        return dict[u];//建立序号与dict的关联
    }
    return dict[u] = id;
}
void print(int v) {//中序遍历节点输出
    if(End[v] == kase) printf("%d",v+1);
    else {
        End[v] = kase;
        printf("%s",Nodes[v].s.c_str());
        if(Nodes[v].left != -1) {
            printf("(");
            print(Nodes[v].left);
            printf(",");
            print(Nodes[v].right);
            printf(")");
        }
    }
}
int main() {
    scanf("%d",&sum);
    for(kase = 1; kase <= sum; ++kase) {
        dict.clear();
        cnt = 0;
        scanf("%s",s);
        NextStr = s;
        print(BuildTree());
        printf("\n");
    }
    return 0;
}

Coding是一件有趣的事! 我要通过PAT! —

明白就行,无所谓竞赛,无所谓其他,只是为了学到东西,不在乎这些!