Prerequisite knowledge:
f
f
T
FFT
FFT
Introduction
General NTT modulus
P
P
P must be satisfied
P
=
r
x
2
k
+
1
P=r\times 2^k + 1
P=r×2k+1. But if this condition is not met, NTT cannot be used directly, and any modulus NTT needs to be used.
You can learn about the three-modulus NTT.
Let’s introduce a kind of arbitrary modulus NTT – dismantling coefficient FFT.
Coefficient FFT
For polynomial
f
(
x
)
f(x)
Coefficient of f(x)
A
A
A, you can
A
A
A is divided into high
A
1
A_1
A1? and low
A
0
A_0
A0?, such as
A
=
A
1
x
2
15
+
A
0
A=A_1\times 2^{15} + A_0
A=A1?×215+A0?. Then, you can put
A
1
A_1
A1? as a coefficient to construct a polynomial
f
1
(
x
)
f_1(x)
f1?(x), will
A
0
A_0
A0? as a coefficient to construct a polynomial
f
0
(
x
)
f_0(x)
f0?(x). Similarly, the polynomial
g
(
x
)
g(x)
Coefficient of g(x)
B
B
B is divided into high
B
1
B_1
B1? and low
B
0
B_0
B0?, get
g
1
(
x
)
g_1(x)
g1?(x) and
g
0
(
x
)
g_0(x)
g0?(x).
So,
f
(
x
)
g
(
x
)
=
f
1
(
x
)
g
1
(
x
)
x
2
30
+
(
f
1
(
x
)
g
0
(
x
)
+
f
0
(
x
)
g
1
(
x
)
)
x
2
15
+
f
0
(
x
)
g
0
(
x
)
f(x)g(x)=f_1(x)g_1(x)\times 2^{30} + (f_1(x)g_0(x) + f_0(x)g_1(x))\times 2^{15} + f_0(x)g_0(x)
f(x)g(x)=f1?(x)g1?(x)×230 + (f1?(x)g0?(x) + f0?(x)g1?(x))×215 + f0?(x)g0?(x)
use
f
f
T
FFT
FFT is used for processing, and the forward transformation needs four times. Originally, the inverse transformation also needs four times, but because
f
1
(
x
)
g
0
(
x
)
f_1(x)g_0(x)
f1?(x)g0?(x) and
f
0
(
x
)
g
1
(
x
)
f_0(x)g_1(x)
f0?(x)g1?(x) are of the same order, and can be added and then inversely transformed, so only three inverse transformations are required.
After obtaining the three parts, when multiplying the order of each part,
Throughout the process, a total of
7
7
7 times
D.
f
T
DFT
DFT or
I
D.
f
T
IDFT
IDFT.
The time complexity is
o
(
no
log
?
no
)
O(n\log n)
O(nlogn).
Compared with the three-modulus NTT, the factorized FFT is easy to understand, easy to implement, and runs fast, but sometimes the accuracy may be problematic, and if the coefficients in the resulting polynomial are large and the long double cannot be stored, it cannot be used. Be careful when doing the questions.
Example
Luogu P4245 [Template] Arbitrary modulus polynomial multiplication
#include <bits/stdc++.h> using namespace std; const int N=500000; const long double pi=acos(-1.0); int n,m,l=1; long long p,f[N+5],g[N+5],w1[N+5],w2[N+5],w3[N+5],ans[N+5]; struct cp{<!-- --> long double a, b; cp operator + (const cp ax)const{<!-- --> return (cp){<!-- -->a + ax.a,b + ax.b}; } cp operator -(const cp ax)const{<!-- --> return (cp){<!-- -->a-ax.a,b-ax.b}; } cp operator *(const cp ax)const{<!-- --> return (cp){<!-- -->a*ax.a-b*ax.b,b*ax.a + a*ax.b}; } }w,wn,v1[N+5],v2[N+5],v3[N+5],v4[N+5],h1[N+5],h2[N+5],h3[N+5]; void ch(cp *a){<!-- --> for(int i=1,j=l/2,k;i<l-1;i ++ ){<!-- --> if(i<j) swap(a[i],a[j]); k=l/2; while(j>=k){<!-- --> j-=k;k>>=1; } j + =k; } } void fft(cp *a,int fl){<!-- --> ch(a); for(int i=2;i<=l;i<<=1){<!-- --> wn=(cp){<!-- -->cos(fl*2*pi/i),sin(fl*2*pi/i)}; for(int j=0;j<l;j + =i){<!-- --> w=(cp){<!-- -->1,0}; for(int k=j;k<j + i/2;k + + ,w=w*wn){<!-- --> cp t=a[k],u=w*a[k + i/2]; a[k]=t + u; a[k + i/2]=t-u; } } } } int main() {<!-- --> scanf("%d%d%lld", &n, &m, &p); for(int i=0;i<=n;i ++ ){<!-- --> scanf("%lld", &f[i]);f[i]%=p; } for(int i=0;i<=m;i ++ ){<!-- --> scanf("%lld", &g[i]);g[i]%=p; } while(l<n + m + 1) l<<=1; for(int i=0;i<l;i ++ ){<!-- --> v1[i].a=f[i]>>15; v2[i].a=f[i] &((1<<15)-1); v3[i].a=g[i]>>15; v4[i].a=g[i] &((1<<15)-1); } fft(v1,1);fft(v2,1);fft(v3,1);fft(v4,1); for(int i=0;i<l;i ++ ){<!-- --> h1[i]=v1[i]*v3[i]; h2[i]=v1[i]*v4[i] + v2[i]*v3[i]; h3[i]=v2[i]*v4[i]; } fft(h1,-1);fft(h2,-1);fft(h3,-1); for(int i=0;i<l;i ++ ){<!-- --> w1[i]=(long long)(h1[i].a/l + 0.5)%p; w2[i]=(long long)(h2[i].a/l + 0.5)%p; w3[i]=(long long)(h3[i].a/l + 0.5)%p; ans[i]=(w1[i]*(1<<30)%p + w2[i]*(1<<15)%p+w3[i])%p; } for(int i=0;i<=n + m;i + + ){<!-- --> printf("%lld ",ans[i]); } return 0; }