Use Protostuff to serialize data and to read and write files very quickly.

Of all the approaches I have tested, this one is the fastest. It can serve as a simple local file database and can store data as either binary or plain text; which one to use is a matter of personal preference. Here I store the data in binary form.

There are many ways to read and write files in Java, but for simple reads and writes the Java NIO approach was the fastest in my tests.

FileUtil.java

package protoBuf;

import java.io.*;
import java.nio.ByteBuffer;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.util.List;

public class FileUtil {

    public static final int BUFSIZE = 1024 * 8;

    /**
     * Write binary data by appending
     */
    public static void writeByte2File(byte[] bytes, String writePath) {
        // try-with-resources closes the stream even if write() throws
        try (FileOutputStream fos = new FileOutputStream(writePath, true)) {
            fos.write(bytes);
            fos.flush();
        } catch (Exception e) {
            e.printStackTrace();
        }
    }

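    /**
     * Append binary data without closing the stream.
     * Note: every call opens a new FileOutputStream that is never closed;
     * for repeated writes prefer writeByte2FileFlush2Stream, which reuses one stream.
     */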
    public static void writeByte2FileFlush(byte[] bytes, String writePath) {
        try {
            FileOutputStream fos = new FileOutputStream(writePath, true);
            fos.write(bytes);
            fos.flush();
        } catch (Exception e) {
            e.printStackTrace();
        }
    }

    /**
     * Append lines to a file using a Java NIO FileChannel
     *
     * @param filepath    path of the file to append to
     * @param contentList the lines to be written to the file
     * @param bufferSize  write buffer size in bytes; defaults to 4 MB (1024 * 1024 * 4) when null
     */
    public static void write2FileChannel(String filepath, List<String> contentList, Integer bufferSize) {

        bufferSize = null == bufferSize ? 4194304 : bufferSize;
        ByteBuffer buf = ByteBuffer.allocate(bufferSize);
        FileChannel channel = null;
        try {
            File fileTemp = new File(filepath);
            File parent = fileTemp.getParentFile();
            // getParentFile() returns null when the path has no parent directory
            if (parent != null && !parent.exists()) parent.mkdirs();
            if (!fileTemp.exists()) fileTemp.createNewFile();

            channel = new FileOutputStream(filepath, true).getChannel();

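            for (int i = 0; i < contentList.size(); i++) {
                byte[] line = (contentList.get(i) + "\r\n").getBytes();
                // Drain the buffer to the channel before it would overflow
                if (line.length > buf.remaining()) {
                    buf.flip();
                    while (buf.hasRemaining()) {
                        channel.write(buf);
                    }
                    buf.clear();
                }
                if (line.length > buf.capacity()) {
                    // A line larger than the whole buffer is written directly
                    ByteBuffer big = ByteBuffer.wrap(line);
                    while (big.hasRemaining()) {
                        channel.write(big);
                    }
                } else {
                    buf.put(line);
                }
            }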

            buf.flip();   // switch to readable mode

            while (buf.hasRemaining()) {
                channel.write(buf);
            }
        } catch (Exception e) {
            e.printStackTrace();
        } finally {
            try {
                // channel is still null if opening the file failed
                if (channel != null) {
                    channel.close();
                }
            } catch (IOException e) {
                e.printStackTrace();
            }
        }
    }


    /**
     * Merge files in NIO mode
     */
    public static void mergeFiles(File outFile, String[] files) {
        FileChannel outChannel = null;
        try {
            outChannel = new FileOutputStream(outFile).getChannel();
            for (String f : files) {
                if (null != f) {
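                    // try-with-resources closes the input channel even if a read fails
                    try (FileChannel fc = new FileInputStream(f).getChannel()) {
                        ByteBuffer bb = ByteBuffer.allocate(BUFSIZE);
                        while (fc.read(bb) != -1) {
                            bb.flip();
                            // a single write() call may not drain the whole buffer
                            while (bb.hasRemaining()) {
                                outChannel.write(bb);
                            }
                            bb.clear();
                        }
                    }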
                }
            }
        } catch (IOException ioe) {
            ioe.printStackTrace();
        } finally {
            try {
                if (outChannel != null) {
                    outChannel.close();
                }
            } catch (IOException ignore) {
            }
        }
    }

    /**
     * Append binary data through a reusable stream. The stream is created on first use
     * and is not closed here; the caller is responsible for closing it.
     */
    public static FileOutputStream writeByte2FileFlush2Stream(FileOutputStream fos, byte[] bytes, String writePath) {
        try {
            if (fos == null) {
                fos = new FileOutputStream(writePath, true);
            }
            fos.write(bytes);
            fos.flush();
        } catch (Exception e) {
            e.printStackTrace();
        }
        return fos;
    }

    /**
     * The NIO method reads the contents of the file into the memory at one time
     */
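    public static byte[] readDataFromFile(String filePath) throws Exception {
        // Open read-only ("rw" would create a missing file and require write permission);
        // try-with-resources closes the file and its channel
        try (RandomAccessFile file = new RandomAccessFile(filePath, "r");
             FileChannel fileChannel = file.getChannel()) {
            // Note: a single mapping is limited to Integer.MAX_VALUE bytes (about 2 GB)
            MappedByteBuffer buffer = fileChannel.map(FileChannel.MapMode.READ_ONLY, 0, fileChannel.size());

            byte[] res = new byte[buffer.capacity()];
            buffer.get(res);
            return res;
        }
    }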

    public static void main(String[] args) {
        for(int i=0;i<100;i++){
            // writeByte2File closes its stream, so the loop does not leak 100 open file handles
            writeByte2File(("test"+i).getBytes(), "E:\\testNull.txt");
        }

        try{
            byte[] file = readDataFromFile("E:\\testNull.txt");
            System.out.println("file="+new String(file));
        }catch(Exception e){
            e.printStackTrace();
        }

    }

}
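
For comparison, the append and read-everything operations can also be written with the java.nio.file.Files helpers that ship with the JDK. What follows is only a minimal sketch: the class NioFilesSketch and its method names are hypothetical and not part of the FileUtil above, and I have not benchmarked it against the FileChannel version.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical helper for illustration; not part of the original FileUtil.
final class NioFilesSketch {

    /** Appends bytes to the file, creating it if it does not exist yet. */
    static void append(Path path, byte[] bytes) {
        try {
            Files.write(path, bytes, StandardOpenOption.CREATE, StandardOpenOption.APPEND);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Reads the whole file into memory in one call. */
    static byte[] readAll(Path path) {
        try {
            return Files.readAllBytes(path);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Appends "test0", "test1", ... to a temporary file and returns the file's text. */
    static String roundTrip(int count) {
        try {
            Path tmp = Files.createTempFile("nio-sketch", ".txt");
            try {
                for (int i = 0; i < count; i++) {
                    append(tmp, ("test" + i).getBytes(StandardCharsets.UTF_8));
                }
                return new String(readAll(tmp), StandardCharsets.UTF_8);
            } finally {
                Files.deleteIfExists(tmp);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip(3)); // prints test0test1test2
    }
}
```

Each call to append opens and closes the file, so for a long series of small writes a single reused stream or channel is likely the better choice.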

A simple object wrapper class:

WrapperUtil.java

package protoBuf;

public class WrapperUtil<T> {
    private T data;

    public static <T> WrapperUtil<T> builder(T data) {
        WrapperUtil<T> wrapper = new WrapperUtil<>();
        wrapper.setData(data);
        return wrapper;
    }


    public T getData() {
        return data;
    }

    public void setData(T data) {
        this.data = data;
    }
}

The main Protostuff utility class provides the serialization and deserialization methods, which are the key to speeding up data processing.

ProtoBufUtil.java 

package protoBuf;

import com.google.common.collect.Maps;
import io.protostuff.LinkedBuffer;
import io.protostuff.ProtostuffIOUtil;
import io.protostuff.Schema;
import io.protostuff.runtime.RuntimeSchema;
import org.springframework.objenesis.Objenesis;
import org.springframework.objenesis.ObjenesisStd;

import java.util.*;
import java.util.concurrent.CopyOnWriteArrayList;

public class ProtoBufUtil {

    private static Objenesis objenesis = new ObjenesisStd(true);

    /**
     * A collection of classes that need to be serialized/deserialized using wrapper classes
     */
    private static final Set<Class<?>> WRAPPER_SET = new HashSet<>();

    /**
     * Serialize/deserialize wrapper class objects
     */
    private static final Class<WrapperUtil> WRAPPER_CLASS = WrapperUtil.class;

    /**
     * Serialize/deserialize wrapper class schema objects
     */
    private static final Schema<WrapperUtil> WRAPPER_SCHEMA = RuntimeSchema.createFrom(WRAPPER_CLASS);

    /**
     * Cache object and object schema information collection
     */
    private static final Map<Class<?>, Schema<?>> CACHE_SCHEMA = Maps.newConcurrentMap();

    /**
     * Predefine some objects that Protostuff cannot directly serialize/deserialize
     */
    static {
        WRAPPER_SET.add(List.class);
        WRAPPER_SET.add(ArrayList.class);
        WRAPPER_SET.add(CopyOnWriteArrayList.class);
        WRAPPER_SET.add(LinkedList.class);
        WRAPPER_SET.add(Stack.class);
        WRAPPER_SET.add(Vector.class);

        WRAPPER_SET.add(Map.class);
        WRAPPER_SET.add(HashMap.class);
        WRAPPER_SET.add(TreeMap.class);
        WRAPPER_SET.add(Hashtable.class);
        WRAPPER_SET.add(SortedMap.class);

        WRAPPER_SET.add(Object.class);
    }


    public ProtoBufUtil() {
    }

    @SuppressWarnings({"unchecked"})
    public static <T> byte[] serializer(T obj) {
        Class<T> cls = (Class<T>) obj.getClass();
        LinkedBuffer buffer = LinkedBuffer.allocate(LinkedBuffer.DEFAULT_BUFFER_SIZE);
        try {
            Schema<T> schema = getSchema(cls);
            return ProtostuffIOUtil.toByteArray(obj, schema, buffer);
        } catch (Exception e) {
            System.out.println("protobuf serializer fail");
            throw new IllegalStateException(e.getMessage(), e);
        } finally {
            buffer.clear();
        }
    }

    public static <T> T deserializer(byte[] bytes, Class<T> clazz) {
        try {
            T message = (T) objenesis.newInstance(clazz);
            Schema<T> schema = getSchema(clazz);
            ProtostuffIOUtil.mergeFrom(bytes, message, schema);
            return message;
        } catch (Exception e) {
            System.out.println("protobuf deserializer fail");
            throw new IllegalStateException(e.getMessage(), e);
        }
    }

    /**
     * Register the Class object that needs to be serialized/deserialized using the wrapper class
      *
      * @param clazz The type of Class object that needs to be wrapped
     */
    public static void registerWrapperClass(Class<?> clazz) {
        WRAPPER_SET.add(clazz);
    }

    /**
     * Get the schema of the serialized object type
      *
      * @param cls the class of the serialized object
      * @param <T> The type of the serialized object
      * @return the schema of the serialized object type
     */
    @SuppressWarnings({"unchecked", "rawtypes"})
    private static <T> Schema<T> getSchema(Class<T> cls) {
        Schema<T> schema = (Schema<T>) CACHE_SCHEMA.get(cls);
        if (schema == null) {
            schema = RuntimeSchema.createFrom(cls);
            CACHE_SCHEMA.put(cls, schema);
        }
        return schema;
    }

    /**
     * Serialize an object, using the wrapper class for types registered in WRAPPER_SET
     *
     * @param obj the object to be serialized
     * @param <T> the type of the serialized object
     * @return the serialized binary array
     */
    @SuppressWarnings("unchecked")
    public static <T> byte[] serializeCollect(T obj) {
        Class<T> clazz = (Class<T>) obj.getClass();
        LinkedBuffer buffer = LinkedBuffer.allocate(LinkedBuffer.DEFAULT_BUFFER_SIZE);
        try {
            Object serializeObject = obj;
            Schema schema = WRAPPER_SCHEMA;
            if (!WRAPPER_SET.contains(clazz)) {
                schema = getSchema(clazz);
            } else {
                serializeObject = WrapperUtil.builder(obj);
            }
            return ProtostuffIOUtil.toByteArray(serializeObject, schema, buffer);
        } catch (Exception e) {
           System.out.println("Exception");
            throw new IllegalStateException(e.getMessage(), e);
        } finally {
            buffer.clear();
        }
    }

    /**
     * Deserialize an object, unwrapping types that were serialized through the wrapper class
     *
     * @param data  the binary array that needs to be deserialized
     * @param clazz the class of the deserialized object
     * @param <T>   the object type after deserialization
     * @return the deserialized object
     */
    public static <T> T deserializeCollect(byte[] data, Class<T> clazz) {
        try {
            if (!WRAPPER_SET.contains(clazz)) {
                // Class.newInstance() is deprecated since Java 9
                T message = clazz.getDeclaredConstructor().newInstance();
                Schema<T> schema = getSchema(clazz);
                ProtostuffIOUtil.mergeFrom(data, message, schema);
                return message;
            } else {
                WrapperUtil<T> wrapper = new WrapperUtil<T>();
                ProtostuffIOUtil.mergeFrom(data, wrapper, WRAPPER_SCHEMA);
                return wrapper.getData();
            }
        } catch (Exception e) {
            System.out.println("deserialize exception");
            throw new IllegalStateException(e.getMessage(), e);
        }
    }

    public static byte[] subBytes(byte[] src, int begin, int count) {
        byte[] bs = new byte[count];
        System.arraycopy(src, begin, bs, 0, count);
        return bs;
    }

    public static byte[] intToByteArray(int i) {
        byte[] result = new byte[4];
        result[0] = (byte) ((i >> 24) & 0xFF);
        result[1] = (byte) ((i >> 16) & 0xFF);
        result[2] = (byte) ((i >> 8) & 0xFF);
        result[3] = (byte) (i & 0xFF);
        return result;
    }

    public static int byteArrayToInt(byte[] bytes) {
        int value = 0;
        for (int i = 0; i < 4; i++) {
            int shift = (3 - i) * 8;
            value += (bytes[i] & 0xFF) << shift;
        }
        return value;
    }
}
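
The three byte helpers at the end of ProtoBufUtil (subBytes, intToByteArray and byteArrayToInt) have close equivalents in the JDK. The sketch below shows them side by side; LengthCodecSketch and its method names are hypothetical, chosen only for this illustration. It relies on ByteBuffer using big-endian byte order by default, which matches the shifts in intToByteArray.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;

// Hypothetical helper class showing JDK equivalents of the hand-written helpers.
final class LengthCodecSketch {

    /** Encodes an int as 4 big-endian bytes (ByteBuffer is big-endian by default). */
    static byte[] intToBytes(int value) {
        return ByteBuffer.allocate(4).putInt(value).array();
    }

    /** Decodes the first 4 bytes of the array as a big-endian int. */
    static int bytesToInt(byte[] bytes) {
        return ByteBuffer.wrap(bytes, 0, 4).getInt();
    }

    /** Copies count bytes starting at index begin, like subBytes. */
    static byte[] slice(byte[] src, int begin, int count) {
        return Arrays.copyOfRange(src, begin, begin + count);
    }

    public static void main(String[] args) {
        byte[] encoded = intToBytes(258);                // 258 = 0x00000102
        System.out.println(Arrays.toString(encoded));    // prints [0, 0, 1, 2]
        System.out.println(bytesToInt(encoded));         // prints 258
    }
}
```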

The test class defines a simple data structure, writes it to a file, reads it back, and deserializes the data into Java objects. In my tests Protostuff was the fastest for this sequence of operations, and it stays fast for millions or tens of millions of records.

ProtoUsage.java 

package protoBuf;

import java.io.File;
import java.io.FileOutputStream;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ProtoUsage {

    public static final String filePath = "E:\\testByte";

    //params: list for test
      public static void writeByte2File(List<Product> prodList){
          if(new File(filePath).exists()){
              new File(filePath).delete();
          }
          //try-with-resources closes the stream after the last record is written
          try(FileOutputStream fos = new FileOutputStream(filePath, true)){
              for (Product prod : prodList) {
                  byte data[] = ProtoBufUtil.serializer(prod);
                  //each record is a 4-byte length prefix followed by the serialized bytes
                  byte dataLeng[] = ProtoBufUtil.intToByteArray(data.length);
                  FileUtil.writeByte2FileFlush2Stream(fos, dataLeng, filePath);
                  FileUtil.writeByte2FileFlush2Stream(fos, data, filePath);
              }
          }catch(Exception e){
              e.printStackTrace();
          }

      }

    public static void main(String[] args) {
          try{
              int testCount=5000000;
              List<Product> prodList = new ArrayList<Product>();
              for(int i=0;i<testCount;i++){
                  Product prod = new Product();
                  prod.setId("product="+i);
                  prod.setName("product has a test name: testNo("+i+")");
                  prodList.add(prod);
              }
              long start = System.currentTimeMillis();
              //write test data to file
              writeByte2File(prodList);
              System.out.println("Write data time cost:"+(System.currentTimeMillis()-start));

              //Start reading the file.
              //The biggest advantage of Protostuff is its very fast serialization and deserialization,
              //which saves data processing time for the program.
              //First read all the data at once

              long treatStart = System.currentTimeMillis();
              byte res[] = FileUtil.readDataFromFile(filePath);
              List<Product> resultProd = new ArrayList<Product>();

              //Binary file data structure: each record is a 4-byte big-endian length followed by that many bytes of data
              //[length=16][testtesttesttest][length=18][testestestestteste]
              //The length is stored as 4 raw bytes, not as the text "0016"

              int hasRead = 0;//amount of data processed
              byte length[] = new byte[4];//the length of a single data object

              while (res.length != hasRead) {
                  length[0] = res[0 + hasRead];
                  length[1] = res[1 + hasRead];
                  length[2] = res[2 + hasRead];
                  length[3] = res[3 + hasRead];

                  hasRead += 4;
                  int dataLength = ProtoBufUtil.byteArrayToInt(length);
                  byte finalByte[] = ProtoBufUtil.subBytes(res, hasRead, dataLength);

                  Product prod = ProtoBufUtil.deserializer(finalByte, Product.class);
                  resultProd.add(prod);
                  hasRead += dataLength;
              }
              System.out.println("Read and treat Cost time:"+(System.currentTimeMillis()-treatStart));
              System.out.println("list size:"+resultProd.size());
//              resultProd.forEach(System.out::println);
          }catch(Exception e){
              e.printStackTrace();
          }

    }
}

//Object for test
class Product{
    String id;
    String name;

    public String getId() {
        return id;
    }

    public void setId(String id) {
        this.id = id;
    }

    public String getName() {
        return name;
    }

    public void setName(String name) {
        this.name = name;
    }

    @Override
    public String toString() {
        return "name: " + this.getName() +
                ", id: " + this.getId();
    }
}
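
The read loop in main walks the byte array by hand and assumes the file is never truncated. As a rough illustration of the same length-prefixed layout, the sketch below frames and unframes records with a ByteBuffer and rejects incomplete input. FrameDecoderSketch is a hypothetical class, and plain byte arrays stand in for the Protostuff output so that the example runs without the library.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch of the [4-byte length][payload] layout used in ProtoUsage.
final class FrameDecoderSketch {

    /** Concatenates the records, each preceded by its length as a big-endian int. */
    static byte[] encode(List<byte[]> records) {
        int total = 0;
        for (byte[] record : records) {
            total += 4 + record.length;
        }
        ByteBuffer out = ByteBuffer.allocate(total);
        for (byte[] record : records) {
            out.putInt(record.length).put(record);
        }
        return out.array();
    }

    /** Splits framed bytes back into payloads; throws on truncated or corrupt input. */
    static List<byte[]> decode(byte[] framed) {
        ByteBuffer in = ByteBuffer.wrap(framed);
        List<byte[]> records = new ArrayList<>();
        while (in.hasRemaining()) {
            if (in.remaining() < 4) {
                throw new IllegalArgumentException("truncated length prefix");
            }
            int length = in.getInt();
            if (length < 0 || length > in.remaining()) {
                throw new IllegalArgumentException("record length exceeds remaining data");
            }
            byte[] payload = new byte[length];
            in.get(payload);
            records.add(payload);
        }
        return records;
    }

    public static void main(String[] args) {
        byte[] framed = encode(Arrays.asList(
                "product=0".getBytes(StandardCharsets.UTF_8),
                "product=1".getBytes(StandardCharsets.UTF_8)));
        for (byte[] payload : decode(framed)) {
            System.out.println(new String(payload, StandardCharsets.UTF_8));
        }
    }
}
```

In the real program each payload would be passed to ProtoBufUtil.deserializer instead of being printed.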

Below are the test results for 5 million records. The file is written only once. The test uses the traditional stream-based write because, in real scenarios, the amount of data written at any one time is very small. Writing with NIO or another buffered approach would probably be faster.
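
One possible faster write path, sketched here under the assumption that flushing after every record accounts for much of the write time (I have not measured this), is to send all records through a single buffered stream and let it flush only when it is closed. BufferedFrameSketch is a hypothetical class; plain byte arrays again stand in for the serialized Product objects. DataOutputStream.writeInt writes the same 4 big-endian bytes as intToByteArray, so the file layout stays the same.

```java
import java.io.BufferedInputStream;
import java.io.BufferedOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; byte arrays stand in for Protostuff output.
final class BufferedFrameSketch {

    /** Writes every record as [int length][payload] through one buffered stream. */
    static void writeAll(Path path, List<byte[]> records) throws IOException {
        try (DataOutputStream out = new DataOutputStream(
                new BufferedOutputStream(Files.newOutputStream(path)))) {
            for (byte[] record : records) {
                out.writeInt(record.length); // big-endian, like intToByteArray
                out.write(record);
            }
        } // closing the stream flushes the buffer
    }

    /** Reads records back until the end of the file is reached. */
    static List<byte[]> readAll(Path path) throws IOException {
        List<byte[]> records = new ArrayList<>();
        try (DataInputStream in = new DataInputStream(
                new BufferedInputStream(Files.newInputStream(path)))) {
            while (true) {
                int length;
                try {
                    length = in.readInt();
                } catch (EOFException endOfFile) {
                    break; // no more length prefixes
                }
                byte[] record = new byte[length];
                in.readFully(record); // throws EOFException if the payload is cut short
                records.add(record);
            }
        }
        return records;
    }

    /** Writes count records to a temporary file and returns how many were read back. */
    static int roundTripCount(int count) {
        try {
            Path tmp = Files.createTempFile("framed-sketch", ".bin");
            try {
                List<byte[]> records = new ArrayList<>();
                for (int i = 0; i < count; i++) {
                    records.add(("product=" + i).getBytes(StandardCharsets.UTF_8));
                }
                writeAll(tmp, records);
                return readAll(tmp).size();
            } finally {
                Files.deleteIfExists(tmp);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTripCount(1000)); // prints 1000
    }
}
```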

As the results show, reading 5 million records and deserializing them one by one takes only about 3 seconds. This is the advantage of Protostuff: where only small-scale local file storage is required, reading and processing the data is very fast, which makes it very useful.

Write data time cost:20914 millisecond
Read and treat Cost time:3142 millisecond
list size:5000000
