Load A Matrix from An ASCII Format File (C++ and Python)

It is common for a scientific program to load an ASCII format matrix file, i.e. an ASCII text file consisting of lines of floating-point numbers separated by whitespace. In this post, I am going to show my code (C++ and Python) for loading a matrix from such a file.
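
To make the format concrete, here is a minimal sketch that writes a small file of this kind, so the loaders in this post have something to read. The helper name `write_sample_matrix`, the sample values, and the temporary location are made up for illustration; only the file name input.txt matches the C++ example later in the post.

```python
import os
import tempfile

def write_sample_matrix(path):
    """Write a 2x3 matrix as whitespace-separated floats, one row per line."""
    rows = [[1.0, 2.5, -3.0],
            [4.0, 5.0, 6.25]]
    with open(path, "w") as fo:
        for row in rows:
            # Mix a space and a tab on purpose: both count as delimiters.
            fo.write(" \t".join(repr(x) for x in row) + "\n")

# Hypothetical location; any writable path works.
sample_path = os.path.join(tempfile.mkdtemp(), "input.txt")
write_sample_matrix(sample_path)

with open(sample_path) as fo:
    sample_text = fo.read()
```

Each line of the resulting file is one matrix row, and any run of spaces or tabs separates two numbers.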

C++

The following C++ function loads a matrix from an ASCII file into a vector< vector<double> > object, a kind of “C++ style 2D array”.

#include <istream>
#include <string>
#include <sstream>
#include <vector>

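// Load a matrix from an ASCII text file.
//   is:     input stream to read from (e.g. an std::ifstream)
//   matrix: output; cleared first, then filled with one vector per line
//   delim:  characters that separate the numbers within a line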
void load_matrix(std::istream* is,
        std::vector< std::vector<double> >* matrix,
        const std::string& delim = " \t")
{
    using namespace std;

    string      line;
    string      strnum;

    // clear first
    matrix->clear();

    // parse line by line
    while (getline(*is, line))
    {
        matrix->push_back(vector<double>());

        for (string::const_iterator i = line.begin(); i != line.end(); ++ i)
        {
            // If i is not a delim, then append it to strnum
            if (delim.find(*i) == string::npos)
            {
                strnum += *i;
                if (i + 1 != line.end()) // If it's the last char, do not continue
                    continue;
            }

            // if strnum is still empty, it means the previous char is also a
            // delim (several delims appear together). Ignore this char.
            if (strnum.empty())
                continue;

            // If we reach here, we got a number. Convert it to double.
            double       number;

            istringstream(strnum) >> number;
            matrix->back().push_back(number);

            strnum.clear();
        }
    }
}

// example
#include <fstream>
#include <iostream>

int main()
{
    using namespace std;

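    // open the file; replace "input.txt" with the path to your own matrix file
    std::ifstream is("input.txt");
    if (!is)
    {
        cerr << "Cannot open input.txt" << endl;
        return 1;
    }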

    // load the matrix
    std::vector< std::vector<double> > matrix;
    load_matrix(&is, &matrix);

    // print out the matrix
    cout << "The matrix is:" << endl;
    for (std::vector< std::vector<double> >::const_iterator it = matrix.begin(); it != matrix.end(); ++ it)
    {
        for (std::vector<double>::const_iterator itit = it->begin(); itit != it->end(); ++ itit)
            cout << *itit << '\t';

        cout << endl;
    }

    return 0;
}

The code is also available on GitHub Gist.

Python

The Python code loads the matrix into a numpy.matrix object.

def load_matrix_from_file(f):
    """
    This function is to load an ascii format matrix (float numbers separated by
    whitespace characters and newlines) into a numpy matrix object.

    f is a file object or a file path.
    """

    import numpy

    # types.StringType and types.FileType exist only in Python 2; the checks
    # below also work in Python 3.
    if isinstance(f, str):
        fo = open(f, 'r')
        matrix = load_matrix_from_file(fo)
        fo.close()
        return matrix
    elif hasattr(f, 'read'):
        file_content = f.read().strip()
        file_content = file_content.replace('\r\n', ';')
        file_content = file_content.replace('\n', ';')
        file_content = file_content.replace('\r', ';')

        return numpy.matrix(file_content)

    raise TypeError('f must be a file object or a file name.')

The code is also available on GitHub Gist.

If you want to get a nested list instead of such a numpy.matrix object, you can use the following lines to convert the object to a nested list:

matrix = load_matrix_from_file('file_name')
nested_list = matrix.tolist()
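
If NumPy is not available, the nested list can also be built with the standard library alone. The following is only a minimal sketch, not the code from the Gist: the name `load_nested_list` is hypothetical, and unlike the function above it skips blank lines instead of treating them as rows.

```python
import io

def load_nested_list(f):
    """Parse whitespace-separated floats into a list of rows (lists of float).

    f is a file path or an open text-mode file object. Blank lines are skipped.
    """
    if isinstance(f, str):
        with open(f, "r") as fo:
            return load_nested_list(fo)

    rows = []
    for line in f:
        # split() with no argument splits on any run of whitespace,
        # which also discards a trailing '\r' from Windows line endings.
        tokens = line.split()
        if tokens:
            rows.append([float(t) for t in tokens])
    return rows

# Usage with an in-memory file, so the example runs without touching the disk.
example = io.StringIO("1 2.5\t3\n\n4e-1  5 6\r\n")
nested = load_nested_list(example)
```

A token that is not a valid number makes float() raise a ValueError, so malformed input fails loudly rather than producing a wrong matrix.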

9 thoughts on “Load A Matrix from An ASCII Format File (C++ and Python)”

  1. grapeot

    Just curious about why you use pointers (istream *) rather than references (istream &) for the input stream in the C++ code… Is that some specific reason or it’s just personal preference? Thanks!

    1. Hong Xu

      Just my personal preference. I think passing a variable reference that is going to be changed inside the function body is counter-intuitive. Thus, whenever I see I need to pass in a pointer, I’ll be aware that this variable is going to be changed somehow.

      1. grapeot

        Yeah that makes sense. Thanks for your explanation. I also notice you put const before general references, possibly just to save the time of a copy constructor, which is consistent with your “pointer indicating changes” habit. 🙂

  2. kammo

    Hi,

    I’ve tested the C++ code, and I noticed that the last number in the line is not added to the row vector. I am assuming that there is no delimiter before the EOL character. The reason for which the last number is not pushed back is that when EOL is reached the push_back instruction at line 40 is skipped. I have fixed this problem by conditioning the “continue” at line 30 in the following way:

    […]
    if (delim.find(*i) == string::npos)
    {
        strnum += *i;
        if (i + 1 != line.end())
            continue;
    }
    […]

    Correct me if I’m wrong. 🙂

    1. Hong Xu

      You are right that the last number is not read. Probably it’s more elegant to append a delimiter to the end of line in the code before iteration rather than mess up the loop 🙂

  3. Borja Ribes Blanquer

    Hello Hong!

    I’m a beginner in C++ programming and I need to load a txt file (containing a huge matrix of floating-point numbers) into C++. I have copied and pasted your code into a new Visual Studio 2010 project. It compiles correctly, but when I run it, it gives me an error: fatal error LNK1561: entry point must be defined. I assume this error is because I need to define a main() function and call load_matrix from it. However, I don’t know how to call it, because the function declaration has 3 arguments:

    void load_matrix(std::istream* is,
            std::vector< std::vector<double> >* matrix,
            const std::string& delim = " \t")

    I don’t understand what these arguments are used for.

    In which line of your code can I write the directory where my txt file is located?

    Thank you
