我有一个JSON文件,我想转换为CSV文件。我如何用Python做到这一点?

我试着:

import json
import csv

f = open('data.json')
data = json.load(f)
f.close()

f = open('data.csv')
csv_file = csv.writer(f)
for item in data:
    csv_file.writerow(item)

f.close()

然而,这并没有起作用。我正在使用Django和我收到的错误是:

`file' object has no attribute 'writerow'`

然后我尝试了以下方法:

import json
import csv

f = open('data.json')
data = json.load(f)
f.close()

f = open('data.csv')
csv_file = csv.writer(f)
for item in data:
    f.writerow(item)  # ← changed

f.close()

然后得到错误:

`sequence expected`

样本json文件:

[{
        "pk": 22,
        "model": "auth.permission",
        "fields": {
            "codename": "add_logentry",
            "name": "Can add log entry",
            "content_type": 8
        }
    }, {
        "pk": 23,
        "model": "auth.permission",
        "fields": {
            "codename": "change_logentry",
            "name": "Can change log entry",
            "content_type": 8
        }
    }, {
        "pk": 24,
        "model": "auth.permission",
        "fields": {
            "codename": "delete_logentry",
            "name": "Can delete log entry",
            "content_type": 8
        }
    }, {
        "pk": 4,
        "model": "auth.permission",
        "fields": {
            "codename": "add_group",
            "name": "Can add group",
            "content_type": 2
        }
    }, {
        "pk": 10,
        "model": "auth.permission",
        "fields": {
            "codename": "add_message",
            "name": "Can add message",
            "content_type": 4
        }
    }
]

当前回答

我已经尝试了很多建议的解决方案(也熊猫没有正确地规范化我的JSON),但真正好的是正确解析JSON数据来自Max Berman。

我写了一个改进,以避免每一行都有新列 在解析期间将其放置到现有列。 如果只有一个数据存在,则将值存储为字符串,如果该列有更多值,则将值存储为列表。

它有一个输入。Json文件作为输入,并输出一个output.csv。

import json
import pandas as pd

def flatten_json(json):
    def process_value(keys, value, flattened):
        if isinstance(value, dict):
            for key in value.keys():
                process_value(keys + [key], value[key], flattened)
        elif isinstance(value, list):
            for idx, v in enumerate(value):
                process_value(keys, v, flattened)
                # process_value(keys + [str(idx)], v, flattened)
        else:
            key1 = '__'.join(keys)
            if not flattened.get(key1) is None:
                if isinstance(flattened[key1], list):
                    flattened[key1] = flattened[key1] + [value]
                else:
                    flattened[key1] = [flattened[key1]] + [value]
            else:
                flattened[key1] = value

    flattened = {}
    for key in json.keys():
        k = key
        # print("Key: " + k)
        process_value([key], json[key], flattened)
    return flattened

try:
    f = open("input.json", "r")
except:
    pass
y = json.loads(f.read())
flat = flatten_json(y)
text = json.dumps(flat)
df = pd.read_json(text)
df.to_csv('output.csv', index=False, encoding='utf-8')

其他回答

使用pandas库,这就像使用两个命令一样简单!

df = pd.read_json()

read_json将JSON字符串转换为pandas对象(序列或数据帧)。然后:

df.to_csv()

它既可以返回字符串,也可以直接写入csv文件。请参阅to_csv的文档。

根据之前的冗长回答,我们都应该感谢熊猫提供的这条捷径。

关于非结构化JSON,请参阅这个答案。

编辑: 有人问我一个最小的例子:

import pandas as pd

with open('jsonfile.json', encoding='utf-8') as inputfile:
    df = pd.read_json(inputfile)

df.to_csv('csvfile.csv', encoding='utf-8', index=False)

这段代码应该适用于您,假设您的JSON数据在一个名为data. JSON的文件中。

import json
import csv

with open("data.json") as file:
    data = json.load(file)

with open("data.csv", "w") as file:
    csv_file = csv.writer(file)
    for item in data:
        fields = list(item['fields'].values())
        csv_file.writerow([item['pk'], item['model']] + fields)

如果我们考虑下面的例子,将json格式的文件转换为csv格式的文件。

{
 "item_data" : [
      {
        "item": "10023456",
        "class": "100",
        "subclass": "123"
      }
      ]
}

下面的代码将转换json文件(data3. xml)。Json)转换为CSV文件(data3.csv)。

import json
import csv
with open("/Users/Desktop/json/data3.json") as file:
    data = json.load(file)
    file.close()
    print(data)

fname = "/Users/Desktop/json/data3.csv"

with open(fname, "w", newline='') as file:
    csv_file = csv.writer(file)
    csv_file.writerow(['dept',
                       'class',
                       'subclass'])
    for item in data["item_data"]:
         csv_file.writerow([item.get('item_data').get('dept'),
                            item.get('item_data').get('class'),
                            item.get('item_data').get('subclass')])

上面提到的代码已经在本地安装的pycharm中执行,它已经成功地将json文件转换为csv文件。希望这有助于转换文件。

我可能迟到了,但我想,我已经处理过类似的问题。我有一个json文件,看起来像这样

我只想从这些json文件中提取一些键/值。因此,我编写了下面的代码来提取相同的内容。

    """json_to_csv.py
    This script reads n numbers of json files present in a folder and then extract certain data from each file and write in a csv file.
    The folder contains the python script i.e. json_to_csv.py, output.csv and another folder descriptions containing all the json files.
"""

import os
import json
import csv


def get_list_of_json_files():
    """Returns the list of filenames of all the Json files present in the folder
    Parameter
    ---------
    directory : str
        'descriptions' in this case
    Returns
    -------
    list_of_files: list
        List of the filenames of all the json files
    """

    list_of_files = os.listdir('descriptions')  # creates list of all the files in the folder

    return list_of_files


def create_list_from_json(jsonfile):
    """Returns a list of the extracted items from json file in the same order we need it.
    Parameter
    _________
    jsonfile : json
        The json file containing the data
    Returns
    -------
    one_sample_list : list
        The list of the extracted items needed for the final csv
    """

    with open(jsonfile) as f:
        data = json.load(f)

    data_list = []  # create an empty list

    # append the items to the list in the same order.
    data_list.append(data['_id'])
    data_list.append(data['_modelType'])
    data_list.append(data['creator']['_id'])
    data_list.append(data['creator']['name'])
    data_list.append(data['dataset']['_accessLevel'])
    data_list.append(data['dataset']['_id'])
    data_list.append(data['dataset']['description'])
    data_list.append(data['dataset']['name'])
    data_list.append(data['meta']['acquisition']['image_type'])
    data_list.append(data['meta']['acquisition']['pixelsX'])
    data_list.append(data['meta']['acquisition']['pixelsY'])
    data_list.append(data['meta']['clinical']['age_approx'])
    data_list.append(data['meta']['clinical']['benign_malignant'])
    data_list.append(data['meta']['clinical']['diagnosis'])
    data_list.append(data['meta']['clinical']['diagnosis_confirm_type'])
    data_list.append(data['meta']['clinical']['melanocytic'])
    data_list.append(data['meta']['clinical']['sex'])
    data_list.append(data['meta']['unstructured']['diagnosis'])
    # In few json files, the race was not there so using KeyError exception to add '' at the place
    try:
        data_list.append(data['meta']['unstructured']['race'])
    except KeyError:
        data_list.append("")  # will add an empty string in case race is not there.
    data_list.append(data['name'])

    return data_list


def write_csv():
    """Creates the desired csv file
    Parameters
    __________
    list_of_files : file
        The list created by get_list_of_json_files() method
    result.csv : csv
        The csv file containing the header only
    Returns
    _______
    result.csv : csv
        The desired csv file
    """

    list_of_files = get_list_of_json_files()
    for file in list_of_files:
        row = create_list_from_json(f'descriptions/{file}')  # create the row to be added to csv for each file (json-file)
        with open('output.csv', 'a') as c:
            writer = csv.writer(c)
            writer.writerow(row)
        c.close()


if __name__ == '__main__':
    write_csv()

我希望这能有所帮助。有关此代码如何工作的详细信息,请查看这里

首先,JSON包含嵌套对象,因此通常不能直接转换为CSV。你需要把它改成这样:

{
    "pk": 22,
    "model": "auth.permission",
    "codename": "add_logentry",
    "content_type": 8,
    "name": "Can add log entry"
},
......]

下面是我的代码来生成CSV:

import csv
import json

x = """[
    {
        "pk": 22,
        "model": "auth.permission",
        "fields": {
            "codename": "add_logentry",
            "name": "Can add log entry",
            "content_type": 8
        }
    },
    {
        "pk": 23,
        "model": "auth.permission",
        "fields": {
            "codename": "change_logentry",
            "name": "Can change log entry",
            "content_type": 8
        }
    },
    {
        "pk": 24,
        "model": "auth.permission",
        "fields": {
            "codename": "delete_logentry",
            "name": "Can delete log entry",
            "content_type": 8
        }
    }
]"""

x = json.loads(x)

f = csv.writer(open("test.csv", "wb+"))

# Write CSV Header, If you dont need that, remove this line
f.writerow(["pk", "model", "codename", "name", "content_type"])

for x in x:
    f.writerow([x["pk"],
                x["model"],
                x["fields"]["codename"],
                x["fields"]["name"],
                x["fields"]["content_type"]])

你会得到如下输出:

pk,model,codename,name,content_type
22,auth.permission,add_logentry,Can add log entry,8
23,auth.permission,change_logentry,Can change log entry,8
24,auth.permission,delete_logentry,Can delete log entry,8