Science WTA Obtaining Patient Chest X Ray from Hospital, For FYP: Image Detection for COVID-19

TSiSean
post Sep 25 2020, 08:28 AM, updated 5y ago

iz old liao.
*******
Senior Member
4,496 posts

Joined: Jun 2011



Hi there, does anyone know how to obtain a large quantity of labelled chest X-rays, categorised as Healthy/Normal, Pneumonia/Respiratory Infection, or COVID-19?

I am doing my FYP on COVID-19 image detection using machine learning. Currently on GitHub there is COVID-Net by Linda Wang, whose approach was quite successful in detecting COVID-19 patients.

But I won't be applying her technique, as it is too complex; since I'm quite new to this topic, I will most likely build my own model using TensorFlow.

Also, I realised that the number of Normal/Healthy and COVID-19 chest X-ray images in her dataset is quite limited, and many of the images are such low resolution that I myself have a hard time telling them apart. I'm not sure a machine can learn much from such fuzzy images.

So I was wondering whether it is possible to obtain them from our government and private hospitals through their radiology departments?
I'm not really sure how to contact them, though. Any advice would be appreciated.

I would need a few hundred to a few thousand images in either posterior-anterior (PA) or anterior-posterior (AP) view.

MRI/CT images are also acceptable.



This post has been edited by iSean: Sep 25 2020, 08:48 AM
anakkk
post Sep 25 2020, 08:39 AM

Look at all my stars!!
*******
Senior Member
2,111 posts

Joined: Apr 2013
Those are confidential; I'm not sure whether you can get them. You may need your dean to write a letter to the hospital proposing a collaboration.
TSiSean
post Sep 25 2020, 08:47 AM


QUOTE(anakkk @ Sep 25 2020, 08:39 AM)
…
*
Thank you for your reply. I'm just an undergrad and fairly new to this, so I think I'll eventually approach them and ask.
guess0410
post Oct 28 2020, 12:05 AM

Getting Started
**
Junior Member
103 posts

Joined: Aug 2010
You need to ask your supervisor to contact a physician in an infectious disease clinic for a collaboration. The doctor needs to apply for ethics approval; then the data can be released to you.

Just my personal opinion: the chest X-ray of a COVID patient most likely looks the same as other viral pneumonias. Maybe you should consider CT instead.
TSiSean
post Oct 30 2020, 07:34 PM


QUOTE(guess0410 @ Oct 28 2020, 12:05 AM)
…
*
Well, hopefully someone here on Lowyat has a connection.
I'm not sure whether my uni lecturers have that many connections.
tboxmy
post Oct 30 2020, 09:32 PM

Casual
***
Junior Member
478 posts

Joined: Oct 2006


QUOTE(iSean @ Oct 30 2020, 07:34 PM)
…
*
The chances of finding features specific to COVID-19 on X-rays are slim, but it would be a great find. Just a thought.

What I mean to say is, you hold Lowyat in remarkably high regard!
guess0410
post Oct 31 2020, 12:07 AM

QUOTE(iSean @ Oct 30 2020, 07:34 PM)
…
*
Your chances are low; as a student it's hard to convince them.
It involves a lot of hard work: applying for ethics approval costs money, and digging out all the related files takes time.
And what do they get in return for their precious time?

If your SV is really interested and confident, send invitation emails to all the pulmonologists and infectious disease doctors.
Target those who are young and working in teaching hospitals (there are 5 in Malaysia). Most emails can be found online.
You might have a higher chance that way.

pipedream
post Jan 15 2021, 06:10 PM

Look at all my stars!!
*******
Senior Member
2,353 posts

Joined: Dec 2006
Late to the party, but your project caught my interest because I thought of doing the same thing a while back.

What's the level of your project? If it's an undergraduate FYP, then you don't need to go into that level of detail, rebuilding a model from scratch.

Here's my take:

Instead of retraining the entire model, why not reuse Linda Wang's model, or train your own model on the COVID image set from her GitHub, then use that model to predict on a small set of chest X-rays obtained locally?

It would be infinitely easier to get, say, 20 chest X-rays than thousands of them.

Also, training on a large dataset does not guarantee your model will be accurate on unseen data (overfitting).

It would be more worthy of a publication, and a better project, if you improved detection accuracy by optimising existing models instead of building a new one.
TSiSean
post Jan 15 2021, 06:53 PM


QUOTE(pipedream @ Jan 15 2021, 06:10 PM)
…
*
Haha, yup, a bit late to the party, but my project is still ongoing.
It is undergraduate level.

Well, I think for now I'll just use transfer learning, see how the model performs, then maybe build new layers on top of it,
using Linda Wang's model as a benchmark.

So far I've obtained a small dataset from UM, which will be used as my validation set.

So far my dataset looks like this:

Content      COVID  Normal  Source
Training     200    200     COVIDx (Linda Wang)
Testing      100    100     COVIDx (Linda Wang)
Validation   100    100     UM

I'm running a PyTorch model in Google Colab that someone else built; it has already taken about 12 hours to reach 18 iterations, and then it disconnects...
https://blog.paperspace.com/fighting-corona...-19-classifier/

My main worry is that my model is slow to train, since I lack a proper GPU. I'm running the model on my own dataset following the guide above, but I haven't managed to reach the Grad-CAM part, which visualises what the model is actually doing.
I'm only running it on my laptop. Also, I'm fairly new to this field, since COVID-19 pretty much wrecked my plans to do a physical project for my FYP.
With no other choice, I thought this topic was interesting and took it up, only to realise that waiting for results is a nightmare...

Also, is your last piece of advice transfer learning, i.e. optimising those pre-built models by adding new layers?
Anyhow, I would appreciate any guidance I can get, as I lack mentorship / good advice for my project.

This post has been edited by iSean: Jan 15 2021, 06:56 PM
pipedream
post Jan 15 2021, 07:10 PM

QUOTE(iSean @ Jan 15 2021, 06:53 PM)
…
*
You should try Keras. It's actually way easier than PyTorch. I haven't used PyTorch before, but the code looks complicated.

Keras is super easy and modular to use.

https://blog.keras.io/building-powerful-ima...ittle-data.html

CUDA is supported on a lot of Nvidia cards, even my 750M laptop GPU has it. I suspect your Colab session is running on a single CPU core without CUDA, which is why it's so slow; check your runtime, try using your own PC, or ask your SV to borrow university resources.

You are correct. Your FYP would be easier and more worthwhile for publication if you just focus your effort on optimising a pre-trained model. Use the UM set to further tune the model. Try to get the Linda Wang model from her GitHub and start from there; you can save plenty of time by not rebuilding the model.

Edit: This is also way too complicated for an undergrad FYP. Kudos to you, but don't overdo things. Your project is actually Masters level at least, even just the optimisation part.

This post has been edited by pipedream: Jan 15 2021, 07:13 PM
TSiSean
post Jan 15 2021, 07:49 PM


QUOTE(pipedream @ Jan 15 2021, 07:10 PM)
…
*
Hopefully I just make it out alive and pass. I've really bitten off more than I can chew.

Colab should provide its own GPU, I think? As I thought it is cloud-based training.
From what I saw, Linda Wang hasn't published her model to the public.
I can try emailing her for her code, and see if she replies.

My problem with TensorFlow is that I've only seen a few people actually build models with it.
And I'm not that proficient with all those Python libraries.

And I haven't found a proper guide on auto-tuning the hyperparameters,
or on using AI explainability to visualise heatmaps of how the model reaches its decisions.

Solely looking at training/testing/validation losses and accuracy doesn't really help me...

Here are my training results for 20/60 epochs so far, before it crashed.

[training results screenshot in spoiler]

pipedream
post Jan 15 2021, 07:57 PM

Here's my take on how your project should proceed.

It is redundant to build another new model on Linda Wang's image set, because she already has an optimised one. Don't waste your time and resources redoing it.

You have 200 images to play with; honestly I feel that is enough to build your own model.

1. Train a new model on the UM data, with a small portion held out for validation
2. Use Linda Wang's optimised model to predict on your holdout data, and at the same time use your model trained on the UM data to predict on Linda Wang's data set
3. Compare Linda Wang's optimised model with yours

^ Honestly I feel that is already enough for an FYP project.

If you want to go further:

4. Fine-tune your model with various methods (this is an entirely new project already; stop at step 3)
- Hyperparameter tuning
- Add/remove layers
- Image processing

5. Go back to step 3

Edit 1: Going through your protocol

https://blog.paperspace.com/fighting-corona...-19-classifier/

You can reuse almost the same method, except for this part:
CODE
Define the Model
We now define our model. We use the pretrained VGG-19 with batch normalization as our model. We then replace its final linear layer with one having 2 neurons at its output, and perform transfer learning over our dataset.

We use cross entropy loss as our objective function.


You change the pretrained model to Linda Wang's one.

My suggestion is to stop whatever you are doing right now. Download and play around with Linda Wang's model; from her code she is using TensorFlow, so familiarise yourself with that. Once you are familiar with how TF works and how to build models with it, proceed with step 1 but using Linda Wang's model as the base. Your objective and goal would then be to see whether your image set improves Linda Wang's model.
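To make the swap concrete, here is a minimal, hedged Keras sketch of that freeze-the-base, replace-the-head pattern. The tiny inline "base" is only a stand-in (the real base would be VGG-19 or a Keras-loadable COVID-Net checkpoint, which this sketch does not download):

```python
import numpy as np
from tensorflow.keras import layers, models

# Stand-in base model: in practice this would be the pretrained network
# (e.g. VGG-19, or Linda Wang's model if you can load it in Keras).
base = models.Sequential([
    layers.Input(shape=(64, 64, 3)),
    layers.Conv2D(8, 3, activation="relu"),
    layers.GlobalAveragePooling2D(),
], name="base")

base.trainable = False  # freeze the pretrained weights for transfer learning

# New head: a 2-neuron softmax for COVID vs Normal, as in the blog post.
model = models.Sequential([
    base,
    layers.Dense(2, activation="softmax"),
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")

# Smoke test on random data standing in for chest X-rays.
x = np.random.rand(4, 64, 64, 3).astype("float32")
probs = model.predict(x, verbose=0)
print(probs.shape)  # (4, 2) -- one softmax row per image
```

Only the new head's weights are updated during training; the frozen base just acts as a feature extractor.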



This post has been edited by pipedream: Jan 15 2021, 08:27 PM
pipedream
post Jan 15 2021, 08:09 PM

QUOTE(iSean @ Jan 15 2021, 07:49 PM)
…
*
The pretrained models:

COVIDNet-CXR models (COVID-19 detection using chest X-rays): https://github.com/lindawangg/COVID-Net/blo.../docs/models.md
COVIDNet-CT models (COVID-19 detection using chest CT scans): https://github.com/haydengunraj/COVIDNet-CT.../docs/models.md
COVIDNet-S models (COVID-19 lung severity assessment using chest X-rays): https://github.com/lindawangg/COVID-Net/blo.../docs/models.md

TSiSean
post Jan 15 2021, 08:37 PM


QUOTE(pipedream @ Jan 15 2021, 07:57 PM)
…
*
Sorry ah, I'm a bit slow to grasp the machine learning testing methodology. Let me know whether I've understood you correctly.

Dataset  Training  Testing  Validation
COVID    80        10       10
Normal   80        10       10

Step 1
Say I take / hold out 10 images each of COVID and NORMAL to validate Linda Wang's model.
I get the accuracy, sensitivity, specificity, and F1-score from the confusion matrix of Linda Wang's model.

Step 2
Then I develop my own model, training on the 160-image training set (after deducting the 20 images used for testing during training), and use the remaining 20 for validation.
I obtain the accuracy, sensitivity, specificity, and F1-score from the confusion matrix of my own model on the validation set?

And then compare the metrics from my model and her model?
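For the comparison step, all four metrics mentioned come straight out of the 2x2 confusion matrix. A small numpy sketch, with made-up counts purely for illustration:

```python
import numpy as np

# Hypothetical confusion matrix for a 2-class (COVID vs Normal) model:
# rows = actual class, columns = predicted class.
#               pred COVID  pred Normal
cm = np.array([[9, 1],    # actual COVID: 9 true positives, 1 false negative
               [2, 8]])   # actual Normal: 2 false positives, 8 true negatives

tp, fn = cm[0, 0], cm[0, 1]
fp, tn = cm[1, 0], cm[1, 1]

accuracy    = (tp + tn) / cm.sum()
sensitivity = tp / (tp + fn)          # a.k.a. recall
specificity = tn / (tn + fp)
precision   = tp / (tp + fp)
f1          = 2 * precision * sensitivity / (precision + sensitivity)

print(accuracy, sensitivity, specificity, f1)
# 0.85 0.9 0.8 0.857...
```

Compute the same four numbers for both models on the same held-out images and the comparison is apples-to-apples.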

This post has been edited by iSean: Jan 15 2021, 08:42 PM
pipedream
post Jan 15 2021, 08:58 PM

QUOTE(iSean @ Jan 15 2021, 08:37 PM)
…
*
Go read up on deep learning. Get your fundamentals down before starting, and play around with building simple models for practice first.

Step 1

Split your data set.

It's 80-20:

80% train
20% test

First things first:

The test data set should never be used to train your model. Ever. Even during optimisation you should NEVER optimise/tune your model based on the results of the test data set.
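That split can be sketched with scikit-learn (assuming it is available; the arrays below are dummy stand-ins for your images). `stratify` keeps the COVID/Normal ratio identical in both halves, and the test fold is then left untouched until final evaluation:

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Dummy stand-ins: 100 flattened "images" with 50 COVID / 50 Normal labels.
X = np.random.rand(100, 224 * 224)
y = np.array([1] * 50 + [0] * 50)  # 1 = COVID, 0 = Normal

X_train, X_test, y_train, y_test = train_test_split(
    X, y,
    test_size=0.2,      # the 80-20 split
    stratify=y,         # keep the class balance in both sets
    random_state=42,    # reproducible shuffle
)

print(X_train.shape, X_test.shape)  # (80, 50176) (20, 50176)
print(int(y_test.sum()))            # 10 -> stratification kept 10 COVID in the test set
```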

These 20 randomised test images will be predicted by the LW model; take the predicted accuracy etc. values.

Next, take 20 randomised test images from the LW image set and again predict using the LW model <- this will be your control; your goal is to beat or match it.

Step 2

Take the 80% training set to build your model; you can do basic grid-search hyperparameter tuning etc. to get the best validation accuracy (go read up on cross-validation).
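The grid search + cross-validation combination looks roughly like this in scikit-learn, sketched on dummy data with a simple classifier standing in for the network (with a Keras model you would loop over hyperparameters and folds yourself, or use a wrapper):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, StratifiedKFold

rng = np.random.default_rng(0)
X = rng.random((100, 20))         # dummy features standing in for images
y = (X[:, 0] > 0.5).astype(int)   # dummy binary labels

# 5-fold CV: each fold takes a turn as the validation set, so no separate
# hold-out set is consumed from the 80% training data.
grid = GridSearchCV(
    LogisticRegression(max_iter=1000),
    param_grid={"C": [0.01, 0.1, 1.0, 10.0]},  # hyperparameters to search
    cv=StratifiedKFold(n_splits=5, shuffle=True, random_state=0),
    scoring="accuracy",
)
grid.fit(X, y)

print(grid.best_params_)  # the C with the best mean validation accuracy
print(grid.best_score_)   # that mean validation accuracy
```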

Use the optimised model to predict on the UM test set and the LW test set.

Compare your results with the LW model.

If you want to take the project further, you can compare the accuracy of models built on various pretrained bases (VGG19 is one of the pretrained models mentioned) to see how the accuracy changes.

Complete your project with that. It's an undergrad project; if you are interested in taking it further, wait till you reach postgraduate level.

This post has been edited by pipedream: Jan 15 2021, 09:00 PM
TSiSean
post Jan 15 2021, 09:42 PM


QUOTE(pipedream @ Jan 15 2021, 08:58 PM)
…
*
Hopefully I'm not wasting your breath. I really appreciate the time you've spent explaining all this, as I don't have anyone to guide me through it....
[If you don't mind guiding, I think I can ask my supervisor to add your name to my thesis if you want.]

Back to the topic:

The problem is the guides always use an online dataset already stored in TensorFlow.
Then there's this weird line, "(X_train, y_train), (X_test, y_test) = mnist.load_data()", which makes my life miserable when splitting the data, as I have no idea how they actually split it.

If I'm not mistaken, "y" is the labels/names and "X" is the images.
===================================================

Also, TensorFlow normally uses an ImageDataGenerator, so I also don't know which data it takes.

The terminology of "testing" versus "validation" also confuses me from time to time...

So let me get this straight: when people mention using 80% as training data, that includes the validation data (20% of it) used while training the model, correct?

So the model basically fine-tunes itself on the 80% training data, with the automatic splitting done by the ImageDataGenerator?

Then the testing data is data the model has "never" seen before, and it is fed into the model afterwards to see how well the model performs?
Meaning I should technically export a "model", then manually feed test images into it to get the predicted accuracy etc. values?

pipedream
post Jan 15 2021, 10:11 PM

QUOTE(iSean @ Jan 15 2021, 09:42 PM)
…
*
No problem lah, we learn together. Knowledge is meant to be shared.

You need to learn how to Google when programming.

Unsure about anything? Just paste the code into Google.

https://stackoverflow.com/questions/5806426...in-and-test-set

mnist is the dataset module, so it contains a function called load_data()

So what this code does:

CODE
def load_data(path='mnist.npz'):
    path = get_file(path, origin='https://s3.amazonaws.com/img-datasets/mnist.npz', file_hash='8a61469f7ea1b51cbae51d4f78837e45')
    with np.load(path, allow_pickle=True) as f:
        x_train, y_train = f['x_train'], f['y_train']
        x_test, y_test = f['x_test'], f['y_test']
    return (x_train, y_train), (x_test, y_test)


It loads a dataset that has already been split for you.

So your call (X_train, y_train), (X_test, y_test) = mnist.load_data()

will automatically assign the variables X_train ... y_test to the appropriate sets.

In your case, you need to manually shuffle and subset your data.

I'm not sure of the exact Python code, but one way you can do this is:

1. Index your images from 0-99, e.g. for the COVID-positive images
2. Randomly draw 80 numbers
3. Subset your image dataset based on those 80 numbers

I believe there should be a function that helps you do this. Do your homework lol. Come back to me with the code and I'll check it for you.
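The three steps above can be sketched in a few lines of numpy (`rng.permutation` is the kind of helper function meant here; the file names below are made up for illustration):

```python
import numpy as np

# Step 1: pretend list of 100 COVID-positive image files, indexed 0-99.
covid_files = np.array([f"covid_{i:03d}.png" for i in range(100)])

rng = np.random.default_rng(seed=42)     # seed for a reproducible split
idx = rng.permutation(len(covid_files))  # step 2: shuffled indices 0-99

train_files = covid_files[idx[:80]]      # step 3: first 80 -> training
test_files  = covid_files[idx[80:]]      # remaining 20 -> test

print(len(train_files), len(test_files))       # 80 20
assert set(train_files).isdisjoint(test_files)  # no leakage between sets
```

Repeat the same draw for the Normal images so both classes are split 80/20.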

To answer your second question:

That hold-out method is kinda dated.

We have what's called cross-validation / out-of-bag sampling.

You can read up on it, but it does not involve yet another separate hold-out set, which your limited dataset would suffer from.



This post has been edited by pipedream: Jan 15 2021, 10:15 PM
TSiSean
post Jan 16 2021, 12:51 AM


QUOTE(pipedream @ Jan 15 2021, 10:11 PM)
…
*
[First part] I now remember that I tried a few methods from Google's TF guides previously. I built this last July, just before starting my internship, and then put it aside.

https://colab.research.google.com/drive/1MR...G3q?usp=sharing

Feel free to comment on whether the methodology is okay.

This doesn't require using the annoying (X_train, y_train), (X_test, y_test) method.

[Second part] Well, the TensorFlow 2.0 guide still uses it... is the 80% "training & validation" split really dated already?
I'd better read up on the cross-validation part.

But I think it would be even more helpful for visualising whether my dataset has issues to get something like the gradient map mentioned in the earlier tutorial.

pipedream
post Jan 16 2021, 01:25 AM

QUOTE(iSean @ Jan 16 2021, 12:51 AM)
…
*
Not really dated lah, it's just that I feel cross-validation is more suitable for a small dataset like yours.

A quick look through your code/script: looks good, this is how Keras is used.

The part you can play around with is here:

CODE
baseModel = VGG16(weights="imagenet", include_top=False,
                  input_tensor=Input(shape=(224, 224, 3)))
# construct the head of the model that will be placed on top of
# the base model
headModel = baseModel.output
headModel = AveragePooling2D(pool_size=(4, 4))(headModel)
headModel = Flatten(name="flatten")(headModel)
headModel = Dense(64, activation="relu")(headModel)
headModel = Dropout(0.3)(headModel)
headModel = Dense(3, activation="softmax")(headModel)


See, you are actually adding layers to the base model here. You can try changing the base model to the LW model, then play around with the layers, the activation functions, dimensionality, etc.

I remember there's a layer called Conv2D (a convolutional layer) that is specifically suited to image models.

https://towardsdatascience.com/building-a-c...as-329fbbadc5f5

Try playing around with that as well.
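For reference, a minimal stacked-`Conv2D` net in the same Keras style as the snippet above. The layer sizes here are arbitrary; this is purely a sketch of the pattern, not a tuned architecture:

```python
import numpy as np
from tensorflow.keras import layers, models

# A small CNN for 2-class (COVID vs Normal) chest X-ray classification.
model = models.Sequential([
    layers.Input(shape=(224, 224, 3)),
    layers.Conv2D(16, (3, 3), activation="relu"),  # learn local image features
    layers.MaxPooling2D((2, 2)),                   # downsample the feature maps
    layers.Conv2D(32, (3, 3), activation="relu"),
    layers.MaxPooling2D((2, 2)),
    layers.Flatten(),
    layers.Dense(64, activation="relu"),
    layers.Dropout(0.3),
    layers.Dense(2, activation="softmax"),         # COVID vs Normal
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])

# Smoke test on random data standing in for X-rays.
probs = model.predict(np.random.rand(2, 224, 224, 3), verbose=0)
print(probs.shape)  # (2, 2)
```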
E-Tan
post May 26 2021, 04:36 PM

Getting Started
**
Junior Member
137 posts

Joined: Mar 2013
I'm guessing your FYP is already long done, but for others looking for imaging databases, I found The Cancer Imaging Archive pretty useful!

It looks like their COVID collection is growing:
https://www.cancerimagingarchive.net/collections/
