Updated May 12, 2023
Introduction to Keras pad_sequences
The Keras pad_sequences function pads sequences so that they all have the same length. It transforms a list of sequences (lists of integers) into a 2D NumPy array of shape (num_samples, num_timesteps). Here, num_timesteps is the maxlen argument if we have provided it; otherwise it is the length of the longest sequence in the list.
Key Takeaways
- The Keras pad_sequences function pads sequences so that they all have the same length. To use it, we need to import the tensorflow module.
- Before calling it, we need to define the list of sequences that we want to pad and pass it as the first argument.
What is Keras pad_sequences?
Sequences that are shorter than num_timesteps are padded with value until they are num_timesteps long. Sequences that are longer than num_timesteps are truncated to fit the desired length. The position of padding or truncation is determined by the padding and truncating arguments. By default both are 'pre', so padding is added at the beginning and values are removed from the beginning.
The pad_sequences function takes multiple parameters, the most important of which are padding and truncating. It is a utility for preprocessing sequential data: it converts a list of sample sequences into a 2D NumPy array.
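To make the rule concrete, here is a minimal pure-Python sketch (not the Keras implementation) that works out num_timesteps and reports what would happen to each sequence. The helper name plan_padding and the sample list are assumptions made for illustration.

```python
def plan_padding(sequences, maxlen=None):
    """Return (num_timesteps, actions) for a list of sequences.

    Hypothetical helper for illustration only; it does not call Keras.
    """
    # num_timesteps is maxlen if given, otherwise the longest sequence length.
    num_timesteps = maxlen if maxlen is not None else max(len(s) for s in sequences)
    actions = []
    for s in sequences:
        if len(s) < num_timesteps:
            actions.append("pad")
        elif len(s) > num_timesteps:
            actions.append("truncate")
        else:
            actions.append("unchanged")
    return num_timesteps, actions


sample = [[1, 2], [3, 4, 5], [6, 7, 8, 9]]
print(plan_padding(sample))            # no maxlen: the longest length is used
print(plan_padding(sample, maxlen=3))  # maxlen=3: one padded, one truncated
```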
How to Use Keras pad_sequences?
We need to install the tensorflow and keras modules in our system to use it. The steps below show how to use Keras pad_sequences:
1. First, we install the tensorflow and keras modules in our system by using the python and pip commands as follows.
Code:
python -m pip install tensorflow
python -m pip install keras
Output:
2. Now, we check the installation of the keras and tensorflow modules by importing them with the import keyword as follows.
Code:
import tensorflow as tf
from keras.layers import Dense
Output:
3. After importing the keras and tensorflow modules, we define the sequences to pad. We store them in a list named seq, which we pass to the pad_sequences function in the following steps.
Code:
seq = [[3, 7], [12, 17, 19], [27, 65, 73, 81]]
Output:
4. After defining the list of sequences, we call the pad_sequences function. In the example below, we pass only the seq list, so all defaults apply: shorter sequences are padded with 0 at the beginning, up to the length of the longest sequence.
Code:
tf.keras.preprocessing.sequence.pad_sequences(seq)
Output:
5. In the above step, we passed only the seq list; in the example below, we also set the value argument to 1. The padding positions are then filled with 1 instead of the default 0.
Code:
tf.keras.preprocessing.sequence.pad_sequences(seq, value = 1)
Output:
6. In the example below, we use the padding argument. With padding = 'post', the original values of each sequence come first and the padding values are placed at the end.
Code:
tf.keras.preprocessing.sequence.pad_sequences(seq, padding = 'post')
Output:
7. In the example below, we use the maxlen parameter. With maxlen set, every row of the returned array has exactly maxlen values; here, sequences longer than 2 are truncated from the beginning, so only their last two values are kept.
Code:
tf.keras.preprocessing.sequence.pad_sequences(seq, maxlen = 2)
Output:
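The calls in steps 4 to 7 can be checked against a small pure-Python sketch of the same rules. This is not the Keras implementation: pad_sketch is a hypothetical helper that returns nested lists rather than a NumPy array, and it always truncates from the beginning, which is the default behaviour.

```python
def pad_sketch(sequences, maxlen=None, padding="pre", value=0):
    """Pad (and, if needed, truncate) lists of ints to one common length."""
    if maxlen is None:
        maxlen = max(len(s) for s in sequences)
    padded = []
    for s in sequences:
        kept = s[-maxlen:]  # default truncation drops values from the beginning
        fill = [value] * (maxlen - len(kept))
        padded.append(fill + kept if padding == "pre" else kept + fill)
    return padded


seq = [[3, 7], [12, 17, 19], [27, 65, 73, 81]]
print(pad_sketch(seq))                  # step 4: zeros added in front
print(pad_sketch(seq, value=1))         # step 5: ones instead of zeros
print(pad_sketch(seq, padding="post"))  # step 6: zeros added at the end
print(pad_sketch(seq, maxlen=2))        # step 7: last two values of each sequence
```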
Keras pad_sequences Arguments
The pad_sequences function in Keras accepts several arguments. Only sequences is required; the others are optional and have default values.
Below are the arguments that it accepts:
- sequences – The list of sequences that we have defined. Each sequence in it is a list of integer values.
- maxlen – An optional integer that defines the maximum length of all sequences. If we do not provide it, the sequences are padded to the length of the longest individual sequence.
- dtype – An optional argument that defines the type of the output array. Its default value is int32. To pad sequences with variable-length strings, we need to use object.
- padding – Takes 'pre' or 'post' (default 'pre'). If we provide 'pre', padding is added before each sequence; if we provide 'post', padding is added after each sequence.
- truncating – Takes 'pre' or 'post' (default 'pre'). It removes values from sequences longer than maxlen, either at the beginning or at the end.
- value – The padding value, a float or a string. It is optional, and its default is 0.
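The interaction of padding and truncating is easiest to see on a single sequence. The sketch below is a pure-Python illustration, not Keras code; fit_to_length is a hypothetical name, and it accepts only 'pre' or 'post' for both options, as described in the list above.

```python
def fit_to_length(sequence, maxlen, padding="pre", truncating="pre", value=0):
    """Truncate or pad one list so that it has exactly maxlen items."""
    for name, option in (("padding", padding), ("truncating", truncating)):
        if option not in ("pre", "post"):
            raise ValueError(f"{name} must be 'pre' or 'post', got {option!r}")
    if len(sequence) > maxlen:
        # 'pre' removes from the beginning, 'post' removes from the end.
        if truncating == "pre":
            sequence = sequence[len(sequence) - maxlen:]
        else:
            sequence = sequence[:maxlen]
    fill = [value] * (maxlen - len(sequence))
    # 'pre' puts the padding before the data, 'post' puts it after.
    return fill + list(sequence) if padding == "pre" else list(sequence) + fill


long_seq, short_seq = [37, 55, 74, 89], [13, 17]
for option in ("pre", "post"):
    print(option,
          fit_to_length(long_seq, 3, truncating=option),
          fit_to_length(short_seq, 3, padding=option))
```

The long sequence shows the effect of truncating and the short one shows the effect of padding, so the two options can be compared side by side.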
Keras pad_sequences Function
The syntax below shows how to call pad_sequences in Keras with all of its arguments.
Syntax:
tf.keras.utils.pad_sequences(
sequences,
maxlen = val,
dtype = 'val',
padding = 'val',
truncating = 'val',
value = val)
All arguments in the above syntax except sequences are optional. In the example below, we pass the sequences, value, and padding arguments.
Code:
import tensorflow as tf
seq = [[13, 17], [22, 27, 29], [37, 55, 74, 89]]
tf.keras.preprocessing.sequence.pad_sequences(seq, value = 1, padding = 'pre')
Output:
In the above example, we set padding to 'pre', which pads with the value 1 at the beginning. In the example below, we set padding to 'post', so the padding value is placed at the end.
Code:
import tensorflow as tf
seq = [[13, 17], [22, 27, 29], [37, 55, 74, 89]]
tf.keras.preprocessing.sequence.pad_sequences(seq, value = 1, padding = 'post')
Output:
In the example below, we set all arguments of the function.
Code:
import tensorflow as tf
seq = [[13, 17], [22, 27, 29], [37, 55, 74, 89]]
tf.keras.utils.pad_sequences(
seq,
maxlen = 3,
dtype = 'int32',
padding = 'pre',
truncating = 'post',
value = 0.5
)
Output:
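One detail of the example above is worth checking by hand: dtype is 'int32', so the padding value 0.5 cannot be stored as a fraction. The sketch below assumes the value is truncated toward zero when it is cast to an integer, which is how NumPy normally handles such a cast. It uses plain Python lists and int() as a stand-in rather than calling Keras.

```python
seq = [[13, 17], [22, 27, 29], [37, 55, 74, 89]]
maxlen = 3
fill_value = int(0.5)  # stand-in for the int32 cast (assumption); gives 0

result = []
for s in seq:
    kept = s[:maxlen]                           # truncating='post': drop from the end
    fill = [fill_value] * (maxlen - len(kept))
    result.append(fill + kept)                  # padding='pre': fill goes in front

print(result)
```

To actually pad with 0.5, dtype would need to be a floating-point type such as 'float32'.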
Returns
The Keras pad_sequences function returns a NumPy array of the padded sequences.
The shape of the returned array is as follows:
Shape:
(len(sequences), maxlen)
Here, len(sequences) is the number of sequences in the list we passed, and maxlen is either the value we provided or, if we did not provide one, the length of the longest sequence. The example below returns an array in which every row has length 2.
Code:
import tensorflow as tf
seq = [[13, 17], [22, 27, 29], [37, 55, 74, 89]]
tf.keras.utils.pad_sequences(
seq,
maxlen = 2)
Output:
In the example below, we pass only the sequences, so the returned array is as wide as the longest sequence.
Code:
import tensorflow as tf
seq = [[13, 17], [22, 27, 29], [37, 55, 74, 89]]
tf.keras.utils.pad_sequences(seq)
Output:
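Both results can be summarised by their shape, (len(sequences), maxlen). The following sketch computes that shape in plain Python for the two calls above; expected_shape is a hypothetical helper, and it assumes every sequence is a flat list of integers.

```python
def expected_shape(sequences, maxlen=None):
    """Shape of the padded 2D array: (number of sequences, number of timesteps)."""
    if maxlen is None:
        maxlen = max(len(s) for s in sequences)
    return (len(sequences), maxlen)


seq = [[13, 17], [22, 27, 29], [37, 55, 74, 89]]
print(expected_shape(seq, maxlen=2))  # first example: three rows of two values
print(expected_shape(seq))            # second example: rows as long as the longest sequence
```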
Conclusion
Keras pad_sequences pads sequences that are shorter than num_timesteps with a value until they are num_timesteps long, and truncates sequences that are longer. The function is used to bring sequences to the same length, transforming a list of sequences into a 2D NumPy array.