Updated June 16, 2023
Introduction to Avro Tools
Avro Tools is a set of command-line utilities that ship with Apache Avro. Each utility implements the ‘Tool’ interface, and ‘avro-tools’ is the command-line interface that runs those implementations through its ‘Main’ class. Users run these tools to read and write Avro data files. The tools live in the ‘org.apache.avro.tool’ package, and many of them are available. Avro itself is a popular data serialization system in the Hadoop ecosystem; it is schema-based, and because it does not depend on a particular language, Avro data can be read and written from many programming languages.
List of Avro Tools
Running the jar without arguments, for example ‘$ java -jar ~/avro-tools-1.7.4.jar’, lists the available tools. Each tool implements three methods of the ‘Tool’ interface: ‘getName()’ returns the name used on the command line as a String, ‘getShortDescription()’ returns a one-line description as a String for the command listing, and ‘run(InputStream stdin, PrintStream out, PrintStream err, List<String> args)’ runs the tool with the given arguments and returns an ‘int’ status code. If a tool is run without parameters, it prints its usage. Each tool class also has a constructor, so a tool can be created and run from Java code.
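The relationship between the three methods and the command-line driver can be sketched in plain Java. This is a minimal illustration, not Avro's own code: ‘SketchTool’, ‘EchoTool’, and ‘SketchDriver’ are hypothetical names, and the listing format is an assumption. Only the three method signatures follow the description above.

```java
import java.io.InputStream;
import java.io.PrintStream;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical stand-in that mirrors the three Tool methods described above.
interface SketchTool {
    String getName();
    String getShortDescription();
    int run(InputStream stdin, PrintStream out, PrintStream err, List<String> args) throws Exception;
}

// Hypothetical example tool: prints its arguments, or its usage when none are given.
class EchoTool implements SketchTool {
    public String getName() { return "echo"; }
    public String getShortDescription() { return "Prints its arguments."; }
    public int run(InputStream stdin, PrintStream out, PrintStream err, List<String> args) {
        if (args.isEmpty()) {
            err.println("Usage: echo <words...>");
            return 1;
        }
        out.println(String.join(" ", args));
        return 0;
    }
}

// Hypothetical driver: keeps tools sorted by name and dispatches on the first argument.
class SketchDriver {
    private final Map<String, SketchTool> tools = new TreeMap<>();

    SketchDriver(SketchTool... available) {
        for (SketchTool t : available) {
            tools.put(t.getName(), t);
        }
    }

    // One "name  description" line per tool (the exact format is an assumption).
    String listing() {
        StringBuilder sb = new StringBuilder();
        for (SketchTool t : tools.values()) {
            sb.append(t.getName()).append("  ").append(t.getShortDescription()).append('\n');
        }
        return sb.toString();
    }

    // Runs the tool named by args[0] with the remaining arguments.
    // With no tool name, or an unknown one, prints the listing and returns 1.
    int dispatch(List<String> args, InputStream in, PrintStream out, PrintStream err) {
        SketchTool tool = args.isEmpty() ? null : tools.get(args.get(0));
        if (tool == null) {
            err.print(listing());
            return 1;
        }
        try {
            return tool.run(in, out, err, args.subList(1, args.size()));
        } catch (Exception e) {
            err.println("Tool failed: " + e);
            return 1;
        }
    }
}
```

Passing the streams into ‘run’ instead of using ‘System.in’ and ‘System.out’ directly lets the same tool be called from the command line or from other Java code.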
1. compile
This tool generates Java code for the given schema.
Code:
'$ java -jar /path/to/avro-tools-1.11.0.jar compile schema <schema file> <destination directory>'
2. concat
The concat tool chains Avro data files together without re-compressing them. The input files must have the same schema and the same non-reserved metadata.
Code:
'$ hadoop jar avro-tools.jar concat <input files...> <output file>'
3. cat
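The cat tool extracts samples from Avro data files.
Code:
'$ avro-tools cat part.m.00000.avro <output file>'
This is not the same as the Unix ‘cat’ command, which only prints a file as-is; for example, '$ cat departments.json' displays department data that is already stored as JSON.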
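Avro data files are binary, so it helps to tell them apart from plain-text files such as JSON before choosing a tool. According to the Avro specification, an object container file begins with the four bytes ‘O’, ‘b’, ‘j’, 1. The following is a minimal sketch of that check; the class and method names are hypothetical and not part of avro-tools.

```java
import java.util.Arrays;

// Hypothetical helper, not part of avro-tools.
class AvroMagicSketch {
    // First four bytes of an Avro object container file: 'O', 'b', 'j', 1.
    static final byte[] MAGIC = {(byte) 'O', (byte) 'b', (byte) 'j', (byte) 1};

    // Returns true when the given bytes start with the Avro container-file magic.
    static boolean hasAvroMagic(byte[] firstBytes) {
        if (firstBytes == null || firstBytes.length < MAGIC.length) {
            return false;
        }
        return Arrays.equals(Arrays.copyOf(firstBytes, MAGIC.length), MAGIC);
    }
}
```

The check only looks at the magic bytes; it does not validate the metadata or the data blocks that follow.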
4. fragtojson
This tool converts a binary-encoded Avro datum into JSON format.
Code:
'$ java -jar avro-tools-1.11.0.jar fragtojson'
5. fromjson
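This tool reads JSON records and writes an Avro data file. The command below performs the conversion without compression; it needs the schema of the records and writes the Avro data to standard output.
Code:
'$ java -jar avro-tools-1.11.0.jar fromjson --schema-file <schema file> <JSON file> > <Avro data file>'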
6. fromtext
This tool imports a text file and converts it into an Avro data file.
7. getmeta
This tool prints the metadata of an Avro data file. The metadata includes the schema, so the tool can also be used to look it up, but the output is not given in a structured format.
Code:
'$ hadoop jar avro-tools-1.9.0.jar getmeta <Avro data file>'
8. getschema
This tool prints the schema of an Avro data file. It can be used to extract the schema, and it gives the output in a structured (JSON) format.
Code:
'$ hadoop jar avro-tools-1.9.0.jar getschema <Avro data file>'
9. idl
This tool generates a JSON schema from an Avro IDL file.
Code:
'$ java -jar avro-tools.jar idl'
10. induce
This tool induces a schema from a Java class through reflection.
11. jsontofrag
This tool renders a JSON-encoded Avro datum as binary.
12. recodec
This tool changes the codec of a data file.
13. rpcprotocol
It outputs the protocol of an RPC service.
14. rpcreceive
It opens an RPC server and listens for one message.
15. rpcsend
It sends a single RPC message.
16. tether
It runs a tethered MapReduce job; the tool's class has a constructor, so it can also be created from Java code.
17. tojson
It dumps an Avro data file as JSON, one record per line, which converts Avro data into JSON format.
Code:
'$ avro-tools tojson part.m.00000.avro'
18. totext
It converts an Avro data file into a text file.
Code:
'$ avro-tools totext part.m.00000.avro'
The file name must include its extension; otherwise, the command does not work well.
19. trevni_meta
It dumps the metadata of a Trevni file as JSON.
20. trevni_random
It creates a Trevni file filled with random instances of a schema.
21. trevni_tojson
It dumps a Trevni file as JSON.
22. DataFileRead
This tool reads a data file and dumps it as JSON. Its class has a constructor and implements the ‘getName()’, ‘getShortDescription()’, and ‘run()’ methods.
23. DataFileRepair
This tool recovers data from a corrupt Avro data file. Its class has a constructor.
24. DataFileWrite
It reads newline-delimited JSON records and writes an Avro data file; its class has a constructor.
25. IdlToSchemata
It extracts the JSON schemata of the types of a protocol defined in an IDL file; its class can be extended and has a constructor.
26. main
This is the command-line driver. The ‘Main’ class looks up the tool named by the first argument and runs it with the remaining arguments. Like every Java class, it extends ‘java.lang.Object’, which needs no explicit import.
27. RecordCount
The RecordCountTool counts the records in Avro files or folders; its class has a constructor.
28. SchemaFingerprint
It generates fingerprints from a schema. Its class implements the ‘Tool’ interface and has a constructor.
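As an illustration of what a schema fingerprint is, the Avro specification defines a 64-bit fingerprint called CRC-64-AVRO, computed over the bytes of a schema's canonical form. The following is a minimal sketch of that algorithm as given in the specification; the class name is hypothetical, and whether the tool uses this algorithm by default is an assumption that should be checked against its usage output.

```java
// Hypothetical class name; the algorithm follows the CRC-64-AVRO definition
// in the Avro specification.
class AvroFingerprintSketch {
    // Fingerprint of the empty byte sequence; also the polynomial seed for the table.
    static final long EMPTY = 0xc15d213aa4d7a795L;
    private static final long[] FP_TABLE = buildTable();

    // Precomputes the 256-entry lookup table used by the byte-wise loop.
    private static long[] buildTable() {
        long[] table = new long[256];
        for (int i = 0; i < 256; i++) {
            long fp = i;
            for (int j = 0; j < 8; j++) {
                fp = (fp >>> 1) ^ (EMPTY & -(fp & 1L));
            }
            table[i] = fp;
        }
        return table;
    }

    // Computes the 64-bit fingerprint of the given bytes, normally the
    // UTF-8 bytes of a schema in canonical form.
    static long fingerprint64(byte[] buf) {
        long fp = EMPTY;
        for (byte b : buf) {
            fp = (fp >>> 8) ^ FP_TABLE[(int) (fp ^ b) & 0xff];
        }
        return fp;
    }
}
```

Two schemas with the same canonical form get the same fingerprint, which is why normalization and fingerprinting are usually used together.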
29. SchemaNormalization
This tool converts an Avro schema into its canonical form; its class also has a constructor.
30. SpecificCompiler
This tool compiles an Avro protocol into Java classes with the help of Avro's specific compiler; its class also has a constructor.
31. CreateRandomFile
This tool creates a file filled with randomly generated instances of a schema; its class has a constructor.
Conclusion
In this article, we saw that the Avro tools let us read and write Avro data files, which store data in a serialized format, and that the ‘Tool’ interface is what each of these command-line utilities implements. This article should help in understanding the concept of Avro tools.