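It sounds to me like you are experiencing time-domain aliasing. First, two basic facts:
- If $x[n]$ has length $N$ and $h[n]$ has length $M$, then $x[n] \ast h[n]$ has length $N+M-1$.
Fact 1 holds regardless of the domain in which you perform the convolution/filtering. So if you modify the DFT of $x[n]$ by "shaping" its spectrum, there is an equivalent time-domain filter $h[n]$, which defines an implicit length $M$.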
- Taking an $N_1$-point IDFT of an $N_2$-point signal for $N_1 < N_2$ results in time-aliasing.
Fact 2 is the dual of frequency aliasing caused by undersampling. To illustrate: let $x_1[n]$ denote a signal of length $N_1$. If we take an $N_1$-point DFT of $x_1$ and modify its spectrum so that the result $x[n]$ should have length $N_2>N_1$, then taking an $N_1$-point IDFT instead yields the signal $x_2$ given, for $0 \le n < N_1$, by the following sum, where $x[n]$ is taken to be zero for $n \ge N_2$:
\begin{eqnarray}
x_2[n] = \sum_{p=0}^{\infty}x[n+pN_1]
\end{eqnarray}
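If it helps, the folding in the equation above can be checked numerically. This is only a minimal NumPy sketch: the lengths `N1` and `M`, the random test signal, and the random `h` are assumptions made up for illustration, with `h` standing in for whatever implicit filter your spectral shaping corresponds to.

```python
import numpy as np

rng = np.random.default_rng(0)
N1, M = 16, 5                    # assumed signal length and implicit filter length
x1 = rng.standard_normal(N1)     # x_1[n], length N1
h = rng.standard_normal(M)       # hypothetical stand-in for the implicit filter h[n]

# "Shape" the N1-point spectrum, then take an N1-point IDFT.
# np.fft.fft(h, N1) zero-pads h to N1 points before transforming.
x2 = np.fft.ifft(np.fft.fft(x1) * np.fft.fft(h, N1)).real

# The signal the shaping should have produced: linear convolution, length N2.
x_full = np.convolve(x1, h)
N2 = len(x_full)                 # N1 + M - 1

# Fold x_full as in the equation: x2[n] = sum over p of x_full[n + p*N1].
n_blocks = -(-N2 // N1)          # ceil(N2 / N1)
padded = np.zeros(n_blocks * N1)
padded[:N2] = x_full
folded = padded.reshape(n_blocks, N1).sum(axis=0)

aliasing_matches = np.allclose(x2, folded)
print(aliasing_matches)
```

Here the last $M-1$ samples of the linear convolution wrap around and add onto the first $M-1$ samples of the $N_1$-point output, which is the time-aliasing described above.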
This seems to be consistent with what you described. As you correctly noted, zero-padding the original signal is exactly the solution! Just zero-pad enough that the terms in the above equation for $p>0$ are all zero-valued. The ideas above also tell you exactly how much zero-padding is necessary, based on the implicit filter $h[n]$ that your frequency-domain shaping yields: by Fact 1, the DFT length must be at least $N+M-1$. Good luck!
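For what it's worth, here is a minimal sketch of the zero-padding fix under the same kind of assumptions (made-up lengths `N` and `M`, and a random `h` standing in for your implicit filter). With a DFT length of at least $N+M-1$, the frequency-domain product should reproduce the linear convolution with no wrap-around.

```python
import numpy as np

rng = np.random.default_rng(1)
N, M = 16, 5                     # assumed lengths, for illustration only
x = rng.standard_normal(N)       # original signal
h = rng.standard_normal(M)       # hypothetical implicit filter

L = N + M - 1                    # smallest DFT length that avoids time-aliasing
X = np.fft.fft(x, L)             # fft(x, L) zero-pads x to L points
H = np.fft.fft(h, L)
y = np.fft.ifft(X * H).real      # length-L result, no folded terms

no_aliasing = np.allclose(y, np.convolve(x, h))
print(no_aliasing)
```

If your shaping function is specified directly on the frequency grid rather than through a short $h[n]$, the implicit $M$ may be much larger than you expect, so it is worth inspecting the IDFT of the shaping function before choosing the padding.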