#

spark

Here are 4,465 public repositories matching this topic...

kahrabian
kahrabian commented Mar 2, 2020

Environment:

  1. Framework: PyTorch
  2. Framework version: 1.3.1
  3. Horovod version: 0.19.0
  4. MPI version: 4.0.2
  5. CUDA version: N/A
  6. NCCL version: N/A
  7. Python version: 3.7.5
  8. OS and version: Mac OS 10.15.2
  9. GCC version: 9.2.0

Checklist:

  1. Did you search issues to find if somebody asked this question before? Yes
  2. If your question is about hang, did you read [this d
cube.js
techieslayj
techieslayj commented Mar 26, 2019

For folks using scikit-learn version 0.20.3: the cross-validation function in the cheat sheet (ds-cheatsheets/Python/Datacamp/scikit-learn.pdf) should be sklearn.model_selection.cross_validate, if I'm not mistaken. I was running a linear regression using sklearn v0.20.3, and sklearn.cross_validation.cross_val_score was not recognized, but the aforementioned function was, and my program ran with no
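For anyone hitting the same import error, here is a minimal sketch of a version-tolerant import. As far as I know, the `sklearn.cross_validation` module was deprecated in favor of `sklearn.model_selection` and is gone in 0.20, which would match the behavior described above. The helper name `load_cross_val_score` is made up for illustration and is not part of scikit-learn.

```python
import importlib


def load_cross_val_score():
    """Return cross_val_score from whichever module this scikit-learn provides.

    Returns None when scikit-learn is not installed at all.
    (Hypothetical helper, not a scikit-learn API.)
    """
    # Try the current module first; the old module name is only a fallback
    # for very old installations.
    for module_name in ("sklearn.model_selection", "sklearn.cross_validation"):
        try:
            module = importlib.import_module(module_name)
        except ImportError:
            continue
        return getattr(module, "cross_val_score", None)
    return None


cross_val_score = load_cross_val_score()
```

With scikit-learn installed, the returned function is called the usual way, for example `cross_val_score(LinearRegression(), X, y, cv=5)`, where `X` and `y` are your own feature matrix and targets.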

flytoylf
flytoylf commented Dec 6, 2019

Running ./angel-example com.tencent.angel.example.ml.DeepFMLocalExample locally fails.
The log is as follows:
19/12/06 21:45:10 INFO master.AngelApplicationMaster : write app state over
19/12/06 21:45:10 WARN master.AngelApplicationMaster : App Staging directory is null
19/12/06 21:45:10 INFO master.AngelApplicationMaster : Deleting tmp output directory file:/tmp/work/application_1575639909105_1859751923_ad9a6590-c370-42eb-9a8

flink-learning

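Flink learning blog. http://www.54tianzhisheng.cn Covers Flink basics, concepts, internals, hands-on practice, performance tuning, and source code analysis. Includes learning examples for Flink Connector, Metrics, Library, DataStream API, Table API & SQL, and more, plus case studies of large-scale Flink production projects (PV/UV, log storage, real-time deduplication of tens of billions of records, monitoring and alerting). Your support for my column "Big Data Real-Time Computing Engine: Flink in Practice and Performance Optimization" (《大数据实时计算引擎 Flink 实战与性能优化》) is welcome.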

  • Updated Mar 27, 2020
  • Java
thingsboard

macOS development environment setup: Easy-to-understand instructions with automated setup scripts for developer tools like Vim, Sublime Text, Bash, iTerm, Python data analysis, Spark, Hadoop MapReduce, AWS, Heroku, JavaScript web development, Android development, common data stores, and dev-based OS X defaults.

  • Updated Nov 7, 2019
  • Python

Open Source Fast Scalable Machine Learning Platform For Smarter Applications: Deep Learning, Gradient Boosting & XGBoost, Random Forest, Generalized Linear Modeling (Logistic Regression, Elastic Net), K-Means, PCA, Stacked Ensembles, Automatic Machine Learning (AutoML), etc.

  • Updated Apr 1, 2020
  • Jupyter Notebook
antonkulaga
antonkulaga commented Dec 28, 2017

The configuration looks a bit weird:

 notebook.io.FileSystemNotebookProviderConfigurator {
    notebooks.dir = ${manager.notebooks.dir}
  }

  notebooks {
    ###
    # Server dir (containing notebook files)
    dir = ./notebooks
    dir = ${?NOTEBOOKS_DIR}
  }

  # Configure notebook storage provider
  # Default is FileSystem provider
  # See conf/application-git-storage.conf f
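A note on the repeated `dir` lines quoted above: if this file is HOCON (Typesafe Config), which the `${?...}` syntax suggests, then a second assignment with an optional substitution is a common idiom for "use the default unless the environment variable is defined". The sketch below mimics that rule in plain Python. It is an illustration of the override order only, not the project's code, and `resolve_notebooks_dir` is a hypothetical name.

```python
import os


def resolve_notebooks_dir(env=None, default="./notebooks"):
    """Mimic `dir = ./notebooks` followed by `dir = ${?NOTEBOOKS_DIR}`.

    Hypothetical helper for illustration; `env` defaults to os.environ.
    """
    env = os.environ if env is None else env
    value = default                 # first assignment: dir = ./notebooks
    if "NOTEBOOKS_DIR" in env:      # ${?VAR}: applied only when the variable is defined
        value = env["NOTEBOOKS_DIR"]
    return value
```

Called with an empty environment, the function returns the default `./notebooks`; with `NOTEBOOKS_DIR` defined, that value wins, which is the same "later assignment overrides" order the config relies on.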
malleshjm
malleshjm commented Apr 30, 2018

Hello,

I was able to run Python scripts in dev mode using the steps provided in the documentation, but for production I am not sure which folders to keep or what process to follow. After editing the local conf and local sh files and running the server_deploy script, I was able to generate the server jar, but I still had to manually add the Python context and upload my egg file.
Can someone pleas

ramkumarkb
ramkumarkb commented Feb 5, 2020

I have noticed a small error in the documentation around S3 configurations:
https://docs.delta.io/latest/delta-storage.html#amazon-s3

On the read part, it should be load and not save:
spark.read.format("delta").load("s3a://<your-s3-bucket>/<path>/<to>/<delta-table>")

Also, I have successfully tested Delta 0.5.0 with on-premise S3 - https://min.io
There were some quirks around the

mmlspark
ttpro1995
ttpro1995 commented Nov 13, 2019

Version

com.microsoft.ml.spark:mmlspark_2.11:jar:0.18.1
spark= 2.4.3
scala=2.11.12

data (csv with header) https://gist.github.com/ttpro1995/69051647a256af912803c9a16040f43a

download data and save as csv file, put into folder /data/public/HIGGS/higgs.test.predictioncsv

val data = spark.read.option("header","true").option("inferSchema", "true").csv("/data/public/HIGGS
