KMP小结

印度阿三17 2019-09-08

展开全文

模板题
\(KMP\)是个其实不是很难的算法吧...虽然我表面上学了好久,但其实只有一下午是在学,一下午就完事儿了.
\(KMP\)应该是目前最优秀的单串匹配算法了吧.它的复杂度是\(O(n m)\).其中,\(n\)为匹配串(即长串)的长度,\(m\)为模式串(即短串)的长度.
看起来确实很猛,我们来康康它是怎么如此优秀地完成工作的.
众所周知,\(KMP\)的核心在于一个\(next\)数组.
我们对它的定义可以是:对于\(1~i\)的字符,前缀等于后缀的最长长度.
最长的前缀等于后缀嘛就是.这东西有啥用?
我们来想一想,如果我们在\(i\)这个位置失配,那么根据定义,我们可以知道如果我们跳到\(next_i\)那么不会使得现在已经匹配的位数变少.
反而可能使得匹配的位数更多,所以我们每次失配的时候要跳向\(next_i\),这一定不会使得情况更劣.
然后我们就这样只要失配就一直跳\(next_i\),可以证明,如果存在一个能匹配的子串,我们一定不会漏掉它.
好了,\(next\)是啥和怎么在匹配中使用说完了,我们考虑怎么样去构造这个\(next\)数组.
如果我直接告诉你让它自己和自己匹配你肯定一脸懵逼.
所以我们考虑这玩意儿的定义:前缀等于后缀的最长长度.
你考虑模式串在自己对自己匹配的时候,实际就是自己的一个前缀在和一个后缀匹配.
而最多能匹配多少位就是我们所需要的\(next\)数组.于是我们得到了怎么构造\(next\)数组.
但是直接暴力匹配还是会导致复杂度退化,怎么办呢?
你想,你的\(next\)是从前向后构造的,也就是说你在构造到第\(i\)位的时候,\(1~i-1\)的\(next\)已经构造出来了.
那么我们就可以直接利用之前已经求出来的\(next\)数组去协助我们构造,这样,我们就能达到理想的复杂度.
\(Code:\)

// 这是上面那道模板题的代码.
#include <algorithm>
#include <iostream>
#include <cstdlib>
#include <cstring>
#include <cstdio>
#include <string>
#include <vector>
#include <queue>
#include <cmath>
#include <ctime>
#include <map>
#include <set>
#define MEM(x,y) memset ( x , y , sizeof ( x ) )
#define rep(i,a,b) for (int i = a ; i <= b ;    i)
#define per(i,a,b) for (int i = a ; i >= b ; -- i)
#define pii pair < int , int >
#define X first
#define Y second
#define rint read<int>
#define int long long
#define pb push_back

using std::set ;
using std::pair ;
using std::max ;
using std::min ;
using std::priority_queue ;
using std::vector ;
using std::swap ;
using std::sort ;
using std::unique ;
using std::greater ;

template < class T >
    inline T read () {
        T x = 0 , f = 1 ; char ch = getchar () ;
        while ( ch < '0' || ch > '9' ) {
            if ( ch == '-' ) f = - 1 ;
            ch = getchar () ;
        }
       while ( ch >= '0' && ch <= '9' ) {
            x = ( x << 3 )   ( x << 1 )   ( ch - 48 ) ;
            ch = getchar () ;
       }
   return f * x ;
}

template < class T >
    inline void write (T x) {
       static T stk[100] , top = 0 ;
       if ( x == 0 ) { putchar ('0') ; return ; }
       if ( x < 0 ) { x = - x ; putchar ( '-' ) ; }
       while ( x ) { stk[  top] = x % 10 ; x /= 10 ; }
       while ( top ) { putchar ( stk[top--]   '0' ) ; }
       putchar ('\n') ;
    }

const int N = 1e6   100 ;

char s[N] , t[N] ;
int nxt[N] ;

signed main (int argc , char * argv[]) {
    scanf ("%s%s" , s   1 , t   1 ) ; int j = 0 ;
    int n = strlen ( s   1 ) , m = strlen ( t   1 ) ;
    rep ( i , 2 , m ) {
        while ( j && t[i] != t[j 1] ) j = nxt[j] ;
        if ( t[i] == t[j 1] )    j ; nxt[i] = j ;
    }
    j = 0 ;
    rep ( i , 1 , n ) {
        while ( j && s[i] != t[j 1] ) j = nxt[j] ;
        if ( s[i] == t[j 1] )    j ;
        if ( j == m ) write ( i - m   1 ) ;
    }
    rep ( i , 1 , m ) printf ("%lld " , nxt[i] ) ;
    system ("pause") ; return 0 ;
}

来源：https://www./content-4-443801.html