我有一个JSON文件,我想转换为CSV文件。我如何用Python做到这一点?

我试着:

import json
import csv

f = open('data.json')
data = json.load(f)
f.close()

f = open('data.csv')
csv_file = csv.writer(f)
for item in data:
    csv_file.writerow(item)

f.close()

然而,这并没有起作用。我正在使用Django和我收到的错误是:

`file' object has no attribute 'writerow'`

然后我尝试了以下方法:

import json
import csv

f = open('data.json')
data = json.load(f)
f.close()

f = open('data.csv')
csv_file = csv.writer(f)
for item in data:
    f.writerow(item)  # ← changed

f.close()

然后得到错误:

`sequence expected`

样本json文件:

[{
        "pk": 22,
        "model": "auth.permission",
        "fields": {
            "codename": "add_logentry",
            "name": "Can add log entry",
            "content_type": 8
        }
    }, {
        "pk": 23,
        "model": "auth.permission",
        "fields": {
            "codename": "change_logentry",
            "name": "Can change log entry",
            "content_type": 8
        }
    }, {
        "pk": 24,
        "model": "auth.permission",
        "fields": {
            "codename": "delete_logentry",
            "name": "Can delete log entry",
            "content_type": 8
        }
    }, {
        "pk": 4,
        "model": "auth.permission",
        "fields": {
            "codename": "add_group",
            "name": "Can add group",
            "content_type": 2
        }
    }, {
        "pk": 10,
        "model": "auth.permission",
        "fields": {
            "codename": "add_message",
            "name": "Can add message",
            "content_type": 4
        }
    }
]

当前回答

不幸的是,我没有足够的声誉来为@Alec McGail的惊人回答做出小小的贡献。 我正在使用Python3,我需要将映射转换为@Alexis R注释后面的列表。

另外,我发现csv作者添加了一个额外的CR文件(我有一个空行每一行与数据在csv文件)。根据@Jason R. Coombs对这个帖子的回答,解决方法非常简单: CSV在Python中添加了一个额外的回车

您只需将lineterminator='\n'参数添加到csv.writer。它将是:csv_w = csv。Writer (out_file, lineterminator='\n')

其他回答

JSON可以表示各种各样的数据结构——JS的“对象”大致类似于Python的dict(带有字符串键),JS的“数组”大致类似于Python列表,只要最后的“叶子”元素是数字或字符串,你就可以嵌套它们。

CSV本质上只能表示一个2-D表——可选的第一行是“标题”,即“列名”,这可以使表可解释为字典列表,而不是正常的解释,一个列表的列表(同样,“叶子”元素可以是数字或字符串)。

So, in the general case, you can't translate an arbitrary JSON structure to a CSV. In a few special cases you can (array of arrays with no further nesting; arrays of objects which all have exactly the same keys). Which special case, if any, applies to your problem? The details of the solution depend on which special case you do have. Given the astonishing fact that you don't even mention which one applies, I suspect you may not have considered the constraint, neither usable case in fact applies, and your problem is impossible to solve. But please do clarify!

我可能迟到了,但我想,我已经处理过类似的问题。我有一个json文件,看起来像这样

我只想从这些json文件中提取一些键/值。因此,我编写了下面的代码来提取相同的内容。

    """json_to_csv.py
    This script reads n numbers of json files present in a folder and then extract certain data from each file and write in a csv file.
    The folder contains the python script i.e. json_to_csv.py, output.csv and another folder descriptions containing all the json files.
"""

import os
import json
import csv


def get_list_of_json_files():
    """Returns the list of filenames of all the Json files present in the folder
    Parameter
    ---------
    directory : str
        'descriptions' in this case
    Returns
    -------
    list_of_files: list
        List of the filenames of all the json files
    """

    list_of_files = os.listdir('descriptions')  # creates list of all the files in the folder

    return list_of_files


def create_list_from_json(jsonfile):
    """Returns a list of the extracted items from json file in the same order we need it.
    Parameter
    _________
    jsonfile : json
        The json file containing the data
    Returns
    -------
    one_sample_list : list
        The list of the extracted items needed for the final csv
    """

    with open(jsonfile) as f:
        data = json.load(f)

    data_list = []  # create an empty list

    # append the items to the list in the same order.
    data_list.append(data['_id'])
    data_list.append(data['_modelType'])
    data_list.append(data['creator']['_id'])
    data_list.append(data['creator']['name'])
    data_list.append(data['dataset']['_accessLevel'])
    data_list.append(data['dataset']['_id'])
    data_list.append(data['dataset']['description'])
    data_list.append(data['dataset']['name'])
    data_list.append(data['meta']['acquisition']['image_type'])
    data_list.append(data['meta']['acquisition']['pixelsX'])
    data_list.append(data['meta']['acquisition']['pixelsY'])
    data_list.append(data['meta']['clinical']['age_approx'])
    data_list.append(data['meta']['clinical']['benign_malignant'])
    data_list.append(data['meta']['clinical']['diagnosis'])
    data_list.append(data['meta']['clinical']['diagnosis_confirm_type'])
    data_list.append(data['meta']['clinical']['melanocytic'])
    data_list.append(data['meta']['clinical']['sex'])
    data_list.append(data['meta']['unstructured']['diagnosis'])
    # In few json files, the race was not there so using KeyError exception to add '' at the place
    try:
        data_list.append(data['meta']['unstructured']['race'])
    except KeyError:
        data_list.append("")  # will add an empty string in case race is not there.
    data_list.append(data['name'])

    return data_list


def write_csv():
    """Creates the desired csv file
    Parameters
    __________
    list_of_files : file
        The list created by get_list_of_json_files() method
    result.csv : csv
        The csv file containing the header only
    Returns
    _______
    result.csv : csv
        The desired csv file
    """

    list_of_files = get_list_of_json_files()
    for file in list_of_files:
        row = create_list_from_json(f'descriptions/{file}')  # create the row to be added to csv for each file (json-file)
        with open('output.csv', 'a') as c:
            writer = csv.writer(c)
            writer.writerow(row)
        c.close()


if __name__ == '__main__':
    write_csv()

我希望这能有所帮助。有关此代码如何工作的详细信息,请查看这里

我假设您的JSON文件将解码为字典列表。首先,我们需要一个将JSON对象扁平化的函数:

def flattenjson(b, delim):
    val = {}
    for i in b.keys():
        if isinstance(b[i], dict):
            get = flattenjson(b[i], delim)
            for j in get.keys():
                val[i + delim + j] = get[j]
        else:
            val[i] = b[i]
            
    return val

在JSON对象上运行这段代码的结果:

flattenjson({
    "pk": 22, 
    "model": "auth.permission", 
    "fields": {
      "codename": "add_message", 
      "name": "Can add message", 
      "content_type": 8
    }
  }, "__")

is

{
    "pk": 22, 
    "model": "auth.permission", 
    "fields__codename": "add_message", 
    "fields__name": "Can add message", 
    "fields__content_type": 8
}

对JSON对象输入数组中的每个dict应用此函数后:

input = map(lambda x: flattenjson( x, "__" ), input)

并查找相关的列名:

columns = [x for row in input for x in row.keys()]
columns = list(set(columns))

在CSV模块中运行这个并不难:

with open(fname, 'wb') as out_file:
    csv_w = csv.writer(out_file)
    csv_w.writerow(columns)

    for i_r in input:
        csv_w.writerow(map(lambda x: i_r.get(x, ""), columns))

Alec的回答很好,但在存在多层嵌套的情况下行不通。下面是一个支持多层嵌套的修改版本。如果嵌套对象已经指定了自己的键(例如Firebase Analytics / BigTable / BigQuery数据),它也会使头名称更好一些:

"""Converts JSON with nested fields into a flattened CSV file.
"""

import sys
import json
import csv
import os

import jsonlines

from orderedset import OrderedSet

# from https://stackoverflow.com/a/28246154/473201
def flattenjson( b, prefix='', delim='/', val=None ):
  if val is None:
    val = {}

  if isinstance( b, dict ):
    for j in b.keys():
      flattenjson(b[j], prefix + delim + j, delim, val)
  elif isinstance( b, list ):
    get = b
    for j in range(len(get)):
      key = str(j)

      # If the nested data contains its own key, use that as the header instead.
      if isinstance( get[j], dict ):
        if 'key' in get[j]:
          key = get[j]['key']

      flattenjson(get[j], prefix + delim + key, delim, val)
  else:
    val[prefix] = b

  return val

def main(argv):
  if len(argv) < 2:
    raise Error('Please specify a JSON file to parse')

  print "Loading and Flattening..."
  filename = argv[1]
  allRows = []
  fieldnames = OrderedSet()
  with jsonlines.open(filename) as reader:
    for obj in reader:
      # print 'orig:\n'
      # print obj
      flattened = flattenjson(obj)
      #print 'keys: %s' % flattened.keys()
      # print 'flattened:\n'
      # print flattened
      fieldnames.update(flattened.keys())
      allRows.append(flattened)

  print "Exporting to CSV..."
  outfilename = filename + '.csv'
  count = 0
  with open(outfilename, 'w') as file:
    csvwriter = csv.DictWriter(file, fieldnames=fieldnames)
    csvwriter.writeheader()
    for obj in allRows:
      # print 'allRows:\n'
      # print obj
      csvwriter.writerow(obj)
      count += 1

  print "Wrote %d rows" % count



if __name__ == '__main__':
  main(sys.argv)

这工作得相对较好。 它将json压缩成csv文件。 嵌套元素被管理:)

这是python 3的

import json

o = json.loads('your json string') # Be careful, o must be a list, each of its objects will make a line of the csv.

def flatten(o, k='/'):
    global l, c_line
    if isinstance(o, dict):
        for key, value in o.items():
            flatten(value, k + '/' + key)
    elif isinstance(o, list):
        for ov in o:
            flatten(ov, '')
    elif isinstance(o, str):
        o = o.replace('\r',' ').replace('\n',' ').replace(';', ',')
        if not k in l:
            l[k]={}
        l[k][c_line]=o

def render_csv(l):
    ftime = True

    for i in range(100): #len(l[list(l.keys())[0]])
        for k in l:
            if ftime :
                print('%s;' % k, end='')
                continue
            v = l[k]
            try:
                print('%s;' % v[i], end='')
            except:
                print(';', end='')
        print()
        ftime = False
        i = 0

def json_to_csv(object_list):
    global l, c_line
    l = {}
    c_line = 0
    for ov in object_list : # Assumes json is a list of objects
        flatten(ov)
        c_line += 1
    render_csv(l)

json_to_csv(o)

享受。