Ver tipos de datos

Retorna una lista de tuplas con el nombre y el tipo de dato de las columnas.

data.dtypes

#------------- OUTPUT ------------
# [('_c0', 'int'),
#  ('symbol', 'string'),
#  ('data', 'date'),
#  ('open', 'double'),
#  ('high', 'double'),
#  ('low', 'double'),
#  ('close', 'double'),
#  ('volume', 'int'),
#  ('adjusted', 'double'),
#  ('market_cap', 'string'),
#  ('sector', 'string'),
#  ('industry', 'string'),
#  ('exchange', 'string')]

Ver esquema

Devuelve el esquema del dataframe.

data.schema

# -------------- Ouput ------------------
# StructType(
#           List(
#             StructField(_c0,IntegerType,true),
#             StructField(symbol,StringType,true),
#             StructField(data,DateType,true),
#             StructField(open,DoubleType,true),
#             StructField(high,DoubleType,true),
#             StructField(low,DoubleType,true),
#             StructField(close,DoubleType,true),
#             StructField(volume,IntegerType,true),
#             StructField(adjusted,DoubleType,true),
#             StructField(market_cap,StringType,true),
#             StructField(sector,StringType,true),
#             StructField(industry,StringType,true),
#             StructField(exchange,StringType,true)
#           )
#         )

o

df.printSchema()
# ------------ output ------------

# root
#  |-- _c0: integer (nullable = true)
#  |-- symbol: string (nullable = true)
#  |-- data: date (nullable = true)
#  |-- open: double (nullable = true)
#  |-- high: double (nullable = true)
#  |-- low: double (nullable = true)
#  |-- close: double (nullable = true)
#  |-- volume: integer (nullable = true)
#  |-- adjusted: double (nullable = true)
#  |-- market_cap: string (nullable = true)
#  |-- sector: string (nullable = true)
#  |-- industry: string (nullable = true)
#  |-- exchange: string (nullable = true)

Inspeccionar dataframe

Retorna 20 filas por defecto, toma un nĂºmero como parĂ¡metro para definir la cantidad de filas a mostrar.

data.show(22)

Inspeccionar dataframe como lista

Retorna n filas como lista.

data.head(3)

# ---------- OUTPUT ---------

# [
#  Row(_c0=1, symbol='TXG', data=datetime.date(2019, 9, 12), open=54.0, high=58.0, low=51.0, close=52.75, volume=7326300, adjusted=52.75, market_cap='$9.31B', sector='Capital Goods', industry='Biotechnology: Laboratory Analytical Instruments', exchange='NASDAQ'),
#  Row(_c0=2, symbol='TXG', data=datetime.date(2019, 9, 13), open=52.75, high=54.355, low=49.150002, close=52.27, volume=1025200, adjusted=52.27, market_cap='$9.31B', sector='Capital Goods', industry='Biotechnology: Laboratory Analytical Instruments', exchange='NASDAQ'),